<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Gianluca Fabrizi</title>
    <description>The latest articles on DEV Community by Gianluca Fabrizi (@gfabrizi).</description>
    <link>https://dev.to/gfabrizi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F329006%2F61a5e3e3-d307-4ac6-850b-6aa8b0dc1222.jpg</url>
      <title>DEV Community: Gianluca Fabrizi</title>
      <link>https://dev.to/gfabrizi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gfabrizi"/>
    <language>en</language>
    <item>
      <title>1BRC in PHP FFI + Rust</title>
      <dc:creator>Gianluca Fabrizi</dc:creator>
      <pubDate>Fri, 14 Feb 2025 09:13:00 +0000</pubDate>
      <link>https://dev.to/gfabrizi/1brc-in-php-ffi-rust-5ed9</link>
      <guid>https://dev.to/gfabrizi/1brc-in-php-ffi-rust-5ed9</guid>
      <description>&lt;p&gt;We have tried multi-threading in PHP to speed up execution time; the results are good, but far from perfect. Is there another way we can improve PHP's performance?&lt;/p&gt;

&lt;p&gt;In the previous post, we gave an overview of 1BRC, tried to push the limits of PHP when discussing performance optimization, and ran our best PHP script on an EC2 instance.&lt;/p&gt;

&lt;p&gt;The results were not bad, but not noteworthy either: 17.0636 seconds (the fastest Java code took 1.535 seconds).&lt;/p&gt;

&lt;p&gt;So what are we supposed to do? Call it a day and get on with our lives? No, obviously not!&lt;br&gt;
We could "cheat" our way to a better score by borrowing one of Python's winning strategies: letting external libraries do the heavy lifting!&lt;/p&gt;
&lt;h2&gt;
  
  
  Foreign Function Interface
&lt;/h2&gt;

&lt;p&gt;One way to optimize an interpreted language is to move the slow operations into an external module, usually written in a low-level language.&lt;br&gt;
In PHP you can write system-wide modules and enable them in &lt;code&gt;php.ini&lt;/code&gt;; this is useful for generic functions or for code that is not specific to one application.&lt;br&gt;
PHP 7.4 introduced a new feature: the Foreign Function Interface (FFI).&lt;br&gt;
FFI is a way to call external libraries from your PHP code, without changing the global PHP configuration.&lt;br&gt;
This method is more flexible than dealing with modules, but configuring it can be a bit daunting at first.&lt;/p&gt;

&lt;p&gt;Let's try to wrap a Rust solution of 1BRC in a PHP script (yes, ok, we are definitely cheating).&lt;/p&gt;
&lt;h2&gt;
  
  
  The Rust solution
&lt;/h2&gt;

&lt;p&gt;To keep things simple we need a Rust solution that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;is fast&lt;/li&gt;
&lt;li&gt;is clearly written&lt;/li&gt;
&lt;li&gt;consists of only a few files&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There's no need to explain point &lt;code&gt;1&lt;/code&gt;; points &lt;code&gt;2&lt;/code&gt; and &lt;code&gt;3&lt;/code&gt; are needed because we are going to modify the code to make it work as a module.&lt;br&gt;
I love Rust, but I'm not a Rust programmer, so the simpler the code the better.  &lt;/p&gt;

&lt;p&gt;I chose the solution written by Flavio Bizzarri: &lt;a href="https://github.com/newfla/1brc_rust" rel="noopener noreferrer"&gt;https://github.com/newfla/1brc_rust&lt;/a&gt;  &lt;/p&gt;
&lt;h2&gt;
  
  
  Compiling Rust module
&lt;/h2&gt;

&lt;p&gt;First of all, we clone the repository, then we edit the &lt;code&gt;Cargo.toml&lt;/code&gt; file to add some options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[profile.release]&lt;/span&gt;
&lt;span class="py"&gt;lto&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="py"&gt;strip&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="py"&gt;panic&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"abort"&lt;/span&gt;
&lt;span class="py"&gt;debug&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="py"&gt;opt-level&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="py"&gt;codegen-units&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="nn"&gt;[lib]&lt;/span&gt;
&lt;span class="py"&gt;crate-type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"cdylib"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"lib"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="py"&gt;bench&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the &lt;code&gt;[profile.release]&lt;/code&gt; section, we enable additional performance optimizations (&lt;code&gt;debug&lt;/code&gt;, &lt;code&gt;opt-level&lt;/code&gt;, &lt;code&gt;codegen-units&lt;/code&gt;); we also add the &lt;code&gt;[lib]&lt;/code&gt; section, where we specify that we want to compile the source as a &lt;code&gt;cdylib&lt;/code&gt; library (a shared library that can be linked into external programs).  &lt;/p&gt;

&lt;p&gt;The &lt;code&gt;main.rs&lt;/code&gt; file is used only to call &lt;code&gt;adv::process&lt;/code&gt;; we remove this file and add a &lt;code&gt;run()&lt;/code&gt; function to &lt;code&gt;lib.rs&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;#[no_mangle]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;extern&lt;/span&gt; &lt;span class="s"&gt;"C"&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="nb"&gt;c_char&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="nb"&gt;c_char&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;c_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;unsafe&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nd"&gt;assert!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="nf"&gt;.is_null&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
        &lt;span class="nn"&gt;CStr&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from_ptr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;c_str&lt;/span&gt;&lt;span class="nf"&gt;.to_str&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="nf"&gt;.unwrap&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="nf"&gt;.to_string&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="nn"&gt;adv&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;#[no_mangle]&lt;/code&gt; turns off name mangling (in short: the function keeps its name in the exported library) and marks this function as one to export.  &lt;/p&gt;

&lt;p&gt;We are cheating, but in a responsible way 😅: from PHP code, we pass the weather data filename to the Rust module. Then the Rust module returns the station's aggregated data to be displayed.&lt;br&gt;
PHP is a loosely typed language, while Rust is a strongly typed language, so moving data between the two can be a bit of a challenge within the challenge. We need the &lt;code&gt;libc&lt;/code&gt; crate and &lt;code&gt;ffi::CStr&lt;/code&gt; from &lt;code&gt;std&lt;/code&gt;.  &lt;/p&gt;

&lt;p&gt;The code needed to convert a PHP string to a Rust string slice has been taken from &lt;em&gt;&lt;a href="https://jakegoulding.com/rust-ffi-omnibus/string_arguments/" rel="noopener noreferrer"&gt;"The Rust FFI Omnibus"&lt;/a&gt;&lt;/em&gt;; in its words:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Getting a Rust string slice (&amp;amp;str) requires a few steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;We need to make sure that the C pointer is not &lt;code&gt;NULL&lt;/code&gt; as Rust references are not allowed to be &lt;code&gt;NULL&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use &lt;code&gt;std::ffi::CStr&lt;/code&gt; to wrap the pointer. &lt;code&gt;CStr&lt;/code&gt; will compute the string's length based on the terminating &lt;code&gt;NULL&lt;/code&gt;. This requires an &lt;code&gt;unsafe&lt;/code&gt; block as we will be dereferencing a raw pointer, which the Rust compiler cannot verify meets all the safety guarantees so the programmer must do it instead.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ensure the C string is valid UTF-8 and convert it to a Rust string slice.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use the string slice.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
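
&lt;p&gt;The steps quoted above can also be written without &lt;code&gt;assert!&lt;/code&gt; and &lt;code&gt;unwrap()&lt;/code&gt;. Since the release profile sets &lt;code&gt;panic = "abort"&lt;/code&gt;, a panic inside the library would terminate the whole PHP process. What follows is only a minimal sketch: the helper name &lt;code&gt;path_from_ptr&lt;/code&gt; is hypothetical (it is not part of the original repository), and falling back to an empty string is a choice made for this example.&lt;/p&gt;

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical helper: converts a C string pointer into an owned String.
/// Returns an empty String when the pointer is NULL or the bytes are not
/// valid UTF-8 (a fallback chosen for this sketch, instead of panicking).
fn path_from_ptr(ptr: *const c_char) -> String {
    // Step 1: refuse NULL pointers.
    if ptr.is_null() {
        return String::new();
    }
    // Step 2: wrap the pointer; CStr finds the length from the terminating NUL.
    // SAFETY: assumes `ptr` points to a NUL-terminated string that stays
    // alive for the duration of this call.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    // Steps 3 and 4: validate UTF-8, then copy the slice into a String.
    match c_str.to_str() {
        Ok(s) => s.to_owned(),
        Err(_) => String::new(),
    }
}

fn main() {
    // Simulate what the PHP side passes in: a NUL-terminated filename.
    let owned = CString::new("./rust/measurements.txt").unwrap();
    println!("{}", path_from_ptr(owned.as_ptr()));
    // A NULL pointer yields the empty fallback instead of aborting.
    println!("{:?}", path_from_ptr(std::ptr::null()));
}
```

&lt;p&gt;In a real library the caller would still need a way to tell "empty path" from "invalid input"; returning a NULL pointer from &lt;code&gt;run()&lt;/code&gt; in that case is one possible convention.&lt;/p&gt;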

&lt;p&gt;In &lt;code&gt;adv.rs&lt;/code&gt; we use this code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;json_string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;CString&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;serde_json&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;to_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;cities&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.unwrap&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="nf"&gt;.unwrap&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="n"&gt;json_string&lt;/span&gt;&lt;span class="nf"&gt;.into_raw&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;to return a JSON string to the PHP script.  &lt;/p&gt;
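
&lt;p&gt;One detail worth noting: &lt;code&gt;into_raw()&lt;/code&gt; hands ownership of the buffer to the caller, so Rust will not free it on its own. For a one-shot script this is harmless, but a long-running process would leak every result. Below is a minimal sketch of a companion function that takes the pointer back; the name &lt;code&gt;free_result&lt;/code&gt; is an assumption and does not exist in the original code.&lt;/p&gt;

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Hypothetical companion to `run()`: takes back ownership of a pointer
/// produced by `CString::into_raw` so that Rust can free the buffer.
#[no_mangle]
pub extern "C" fn free_result(ptr: *mut c_char) {
    // Nothing to free for a NULL pointer.
    if ptr.is_null() {
        return;
    }
    // SAFETY: assumes `ptr` came from `CString::into_raw` and is released
    // exactly once; the CString is dropped here, which frees the memory.
    unsafe { drop(CString::from_raw(ptr)) };
}

fn main() {
    // Simulate the value returned to PHP, then release it.
    let raw = CString::new("{\"city\":\"Hamburg\"}").unwrap().into_raw();
    free_result(raw);
    // Passing NULL is a no-op.
    free_result(std::ptr::null_mut());
    println!("released");
}
```

&lt;p&gt;To use something like this, the PHP side would have to declare the extra function in &lt;code&gt;FFI::cdef()&lt;/code&gt; and call it after copying the data with &lt;code&gt;FFI::string()&lt;/code&gt;.&lt;/p&gt;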

&lt;p&gt;That's it for Rust; we can compile the library with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="n"&gt;cargo&lt;/span&gt; &lt;span class="n"&gt;build&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;release&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The PHP script
&lt;/h2&gt;

&lt;p&gt;On the PHP side, we first need a class to manage the input and output of the Rust module. Let's create a file called &lt;code&gt;libonebrc.php&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;
&lt;span class="k"&gt;final&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LibOneBrc&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="nv"&gt;$ffi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;is_null&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nv"&gt;$ffi&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nv"&gt;$ffi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;FFI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;cdef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"char* run(const char* str);"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"rust/libonebrc.so"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$filename&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nv"&gt;$resultPtr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nv"&gt;$ffi&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$filename&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
       &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;FFI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$resultPtr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The constructor uses &lt;code&gt;FFI::cdef()&lt;/code&gt; to import the Rust function from the &lt;code&gt;rust/libonebrc.so&lt;/code&gt; file.&lt;br&gt;
Here we have to declare the external function's signature in C syntax, so the Rust &lt;code&gt;c_char&lt;/code&gt; pointers become &lt;code&gt;char*&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: it's also possible to use a &lt;code&gt;.h&lt;/code&gt; header file to specify the function(s) that PHP needs to know about; since we only need one simple function, it is easier to declare it inline in PHP code.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;run()&lt;/code&gt; method invokes the &lt;code&gt;run&lt;/code&gt; function of the Rust module (&lt;code&gt;self::$ffi-&amp;gt;run($filename)&lt;/code&gt;). The wrapper method and the Rust function share the same name (&lt;code&gt;run()&lt;/code&gt;); this is only a coincidence (...or a lack of imagination), not a requirement.&lt;br&gt;
&lt;code&gt;FFI::string&lt;/code&gt; converts the returned pointer to a string usable in PHP.&lt;/p&gt;

&lt;p&gt;We also need an &lt;code&gt;index.php&lt;/code&gt; file to instantiate this &lt;code&gt;LibOneBrc&lt;/code&gt; class and to print the results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="k"&gt;require_once&lt;/span&gt; &lt;span class="s2"&gt;"libonebrc.php"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$libOneBrc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LibOneBrc&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nv"&gt;$filename&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"./rust/measurements.txt"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nv"&gt;$result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;json_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$libOneBrc&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$filename&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"{"&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="kc"&gt;PHP_EOL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nv"&gt;$isFirstRow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$result&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$key&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nv"&gt;$isFirstRow&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;","&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="kc"&gt;PHP_EOL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$isFirstRow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="s1"&gt;'='&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;max&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="s1"&gt;'/'&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="s1"&gt;'/'&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nb"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="kc"&gt;PHP_EOL&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="s2"&gt;"}"&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="kc"&gt;PHP_EOL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing interesting here: we call our &lt;code&gt;run()&lt;/code&gt; method, passing it the measurements filename.&lt;br&gt;
The JSON string from Rust contains temperatures as integers, so we need to divide them by 10 and calculate the average temperature for each station.&lt;/p&gt;
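
&lt;p&gt;The arithmetic is easy to check in isolation. The sketch below repeats it in Rust purely as an illustration: the field names (&lt;code&gt;min&lt;/code&gt;, &lt;code&gt;max&lt;/code&gt;, &lt;code&gt;sum&lt;/code&gt;, &lt;code&gt;count&lt;/code&gt;) come from the PHP code above, the function name &lt;code&gt;format_station&lt;/code&gt; and the sample values are made up, and the output order follows the challenge specification (minimum/mean/maximum).&lt;/p&gt;

```rust
/// Hypothetical helper: formats one station whose temperatures are stored
/// as integer tenths of a degree (e.g. 345 means 34.5).
fn format_station(city: String, min: i64, max: i64, sum: i64, count: i64) -> String {
    // Mean in tenths of a degree, then scaled down to degrees.
    let mean = sum as f64 / count as f64 / 10.0;
    format!(
        "{}={:.1}/{:.1}/{:.1}",
        city,
        min as f64 / 10.0,
        mean,
        max as f64 / 10.0
    )
}

fn main() {
    // Four readings summing to 48.0 degrees, so the mean is 12.0.
    // prints Hamburg=-12.3/12.0/34.5
    println!("{}", format_station("Hamburg".to_string(), -123, 345, 480, 4));
}
```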
&lt;h2&gt;
  
  
  The benchmark
&lt;/h2&gt;

&lt;p&gt;Let's run this code on the EC2 instance. The configuration is the same as last time: an &lt;code&gt;m6a.8xlarge&lt;/code&gt; with 32 vCPUs and 128GB of memory. For the hard disk, I opted for a 200GB io1 volume (to reach 10,000 IOPS).&lt;/p&gt;

&lt;p&gt;We run it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;perf &lt;span class="nb"&gt;stat&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; 1B-ffi.log &lt;span class="nt"&gt;-r&lt;/span&gt; 10 &lt;span class="nt"&gt;-d&lt;/span&gt; php app/index.php
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and these are the results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; Performance counter stats for 'php app/index.php' (10 runs):

          58802.93 msec task-clock                       #   29.718 CPUs utilized            ( +-  0.26% )
              4736      context-switches                 #   80.191 /sec                     ( +-  3.80% )
                57      cpu-migrations                   #    0.965 /sec                     ( +- 13.37% )
             52703      page-faults                      #  892.378 /sec                     ( +-  1.33% )
   &amp;lt;not supported&amp;gt;      cycles                                                      
   &amp;lt;not supported&amp;gt;      instructions                                                
   &amp;lt;not supported&amp;gt;      branches                                                    
   &amp;lt;not supported&amp;gt;      branch-misses                                               
   &amp;lt;not supported&amp;gt;      L1-dcache-loads                                             
   &amp;lt;not supported&amp;gt;      L1-dcache-load-misses                                       
   &amp;lt;not supported&amp;gt;      LLC-loads                                                   
   &amp;lt;not supported&amp;gt;      LLC-load-misses                                             

            1.9787 +- 0.0197 seconds time elapsed  ( +-  1.00% )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;1.9787 seconds! 🥳 🎉&lt;br&gt;&lt;br&gt;
This is a surprising result, considering the overhead of calling an external module and the fact that we are still making some calculations on the PHP side of the app.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;After this two-part journey we can say that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;PHP is slow, but the performance improves significantly when using threads&lt;/li&gt;
&lt;li&gt;Performance tuning is a game of trade-offs: you can improve the speed of a task by saturating all the CPU cores, but your system will become unresponsive. In PHP this is a problem if your application needs to accept more than one connection at a time&lt;/li&gt;
&lt;li&gt;For heavy tasks, you can delegate to optimized external libraries&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The full code is available on GitHub:&lt;br&gt;
&lt;a href="https://github.com/gfabrizi/1brc-php-ffi" rel="noopener noreferrer"&gt;https://github.com/gfabrizi/1brc-php-ffi&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I hope you enjoyed the post!&lt;/p&gt;

</description>
      <category>php</category>
      <category>ffi</category>
      <category>rust</category>
      <category>performance</category>
    </item>
    <item>
      <title>1 Billion Rows Challenge in PHP</title>
      <dc:creator>Gianluca Fabrizi</dc:creator>
      <pubDate>Fri, 31 Jan 2025 09:09:00 +0000</pubDate>
      <link>https://dev.to/gfabrizi/1-billion-rows-challenge-in-php-5bpe</link>
      <guid>https://dev.to/gfabrizi/1-billion-rows-challenge-in-php-5bpe</guid>
      <description>&lt;p&gt;You've probably heard of the 1BRC (1 Billion Rows Challenge).&lt;br&gt;&lt;br&gt;
In a nutshell, it's a challenge (originally aimed at Java programmers) to write code that can solve a data calculation and aggregation problem on a file with 1 billion rows in the fastest way possible.&lt;/p&gt;

&lt;p&gt;While we can all agree that it is not indicative of the quality of the language or the typical usage scenarios, it is an activity that leads to a better understanding of the programming language being used and to the discovery of more powerful techniques.&lt;/p&gt;

&lt;p&gt;The challenge was very successful and many have tried to transfer the concept to other programming languages.&lt;br&gt;&lt;br&gt;
Today, we will try to tackle the challenge in the language that, at first glance, seems to have the least to say: PHP.&lt;/p&gt;

&lt;p&gt;Let's start by reading the task specifications:&lt;br&gt;&lt;br&gt;
The file with the data (measurements.txt) contains one billion rows. Each row represents the temperature recorded in a weather station and is composed of two fields: the station name and the detected temperature, separated by the &lt;code&gt;;&lt;/code&gt; character.&lt;br&gt;&lt;br&gt;
Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hamburg;12.0
Bulawayo;8.9
Palembang;38.8
St. John's;15.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The task is to read all the lines and calculate the minimum, average and maximum temperatures for each weather station, rounded to one decimal place.&lt;br&gt;&lt;br&gt;
Finally, you have to display this data on the screen in alphabetical order in the format &lt;code&gt;&amp;lt;station name&amp;gt;=&amp;lt;minimum temperature&amp;gt;/&amp;lt;average temperature&amp;gt;/&amp;lt;maximum temperature&amp;gt;&lt;/code&gt;.&lt;br&gt;&lt;br&gt;
Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{Abha=-23.0/18.0/59.2, Abidjan=-16.2/26.0/67.3, Abéché=-10.0/29.4/69.0, Accra=-10.1/26.4/66.4, Addis Ababa=-23.7/16.0/67.0, Adelaide=-27.8/17.3/58.5, ...}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The winning script took 1.535 seconds to finish; the top 5 all finished in under 2 seconds.&lt;br&gt;&lt;br&gt;
Getting to this point in PHP seems daunting, but let's see how far we can go!&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;DISCLAIMER:&lt;/em&gt;&lt;/strong&gt; I will not be using advanced debugging and profiling tools to determine where to optimize during testing; that's not the purpose of this article.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Initial Specifications
&lt;/h2&gt;

&lt;p&gt;We need to find a way to measure the performance of a PHP script. This is where the first problems come in:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;How can we measure the execution time?&lt;/li&gt;
&lt;li&gt;How much data should be passed to the script to ensure reliable results?&lt;/li&gt;
&lt;li&gt;Where should we run the script? Should we do it locally or elsewhere (a dedicated server, a VM, etc...)?
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  Measuring the execution time
&lt;/h3&gt;

&lt;p&gt;All Linux installations should have the &lt;code&gt;time&lt;/code&gt; command; place it before another command and it reports how long that command took to run.&lt;br&gt;
This would seem to be the ideal solution, but there is a problem: it is not very precise, especially if we time only a single execution of the PHP script.&lt;br&gt;
A better approach is to use &lt;code&gt;perf stat&lt;/code&gt; with the &lt;code&gt;-r&lt;/code&gt; option, followed by the number of times you want to run the command being profiled.&lt;br&gt;
Example: &lt;code&gt;perf stat -r 10 my_command&lt;/code&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  How much data should be passed to the script
&lt;/h3&gt;

&lt;p&gt;Running the PHP script on a billion rows could take a long time; we can use a small set of 1 million rows to start doing some tests. Then we can gradually increase the number of rows until we reach 1 billion.&lt;/p&gt;
&lt;h3&gt;
  
  
  Where to run the script
&lt;/h3&gt;

&lt;p&gt;Here, just like before, we can take a step-by-step approach. First we can run the benchmark locally on our computer, to see how the different versions compare.&lt;br&gt;
There's one important thing to remember: close all non-essential programs and make sure that the tests are always run under the same conditions.&lt;br&gt;
The version of PHP locally installed is &lt;code&gt;8.3.14&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Then, when we have an optimized enough PHP script, we can move on to a dedicated server or a virtual machine.&lt;br&gt;
We will need a dedicated server for a short time; the best option would be to use a VM or a cloud instance.  &lt;/p&gt;
&lt;h2&gt;
  
  
  Writing the code
&lt;/h2&gt;
&lt;h3&gt;
  
  
  First attempt
&lt;/h3&gt;

&lt;p&gt;Let's start by writing a solution in PHP in the simplest way we can think of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="nv"&gt;$fp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;fopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'data/measurements.txt'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'r'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nv"&gt;$stations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;fgets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$fp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;explode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;';'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="k"&gt;isset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$stations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$stations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nv"&gt;$stations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nb"&gt;fclose&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$fp&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nv"&gt;$results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
&lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$stations&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$key&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
    &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$key&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$key&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$key&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;array_sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$key&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$value&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nb"&gt;ksort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'{'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;PHP_EOL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$results&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$name&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nv"&gt;$temps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'='&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$temps&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="s1"&gt;'/'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;number_format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$temps&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$temps&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'/'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$temps&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="s1"&gt;','&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;PHP_EOL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'}'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;PHP_EOL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We open the file &lt;code&gt;data/measurements.txt&lt;/code&gt; for reading, populate an array with this data, and finally calculate the minimum, maximum and average temperature for each weather station.&lt;br&gt;
We sort the results alphabetically and print them on the screen.&lt;/p&gt;

&lt;p&gt;We have already created the file &lt;code&gt;measurements.txt&lt;/code&gt; with 1 million lines using the semi-official tool &lt;a href="https://github.com/gunnarmorling/1brc/blob/main/src/main/python/create_measurements.py" rel="noopener noreferrer"&gt;create_measurements.py&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 create_measurements.py 1_000_000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can then launch the first PHP script using the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;perf &lt;span class="nb"&gt;stat&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; results/calculate_1.log &lt;span class="nt"&gt;-r&lt;/span&gt; 10 &lt;span class="nt"&gt;-d&lt;/span&gt; php src/calculate_1.php
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-o&lt;/code&gt; option specifies where to save the log with the execution results.&lt;br&gt;
To have reliable data we run the script 10 times (with the &lt;code&gt;-r&lt;/code&gt; option), while &lt;code&gt;-d&lt;/code&gt; asks &lt;code&gt;perf&lt;/code&gt; for more detailed statistics.&lt;/p&gt;

&lt;p&gt;It took &lt;strong&gt;1.066&lt;/strong&gt; seconds on the laptop I use for writing (Intel N5100 CPU with 8GB of RAM).&lt;br&gt;
On its own, this number doesn't tell us whether that is fast or slow; we need other runs to compare it against.&lt;/p&gt;
&lt;h3&gt;
  
  
  Second attempt: &lt;code&gt;trim()&lt;/code&gt;
&lt;/h3&gt;

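&lt;p&gt;From the screen output we see some strange line breaks that shouldn't be there: &lt;code&gt;fgets()&lt;/code&gt; returns each line together with its trailing newline, which ends up in the temperature string. Let's try to apply a &lt;code&gt;trim()&lt;/code&gt; on the temperatures:&lt;br&gt;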
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$stations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We rerun the script and it takes &lt;strong&gt;1.048&lt;/strong&gt; seconds. The improvement is minimal, but the screen output no longer shows those extra line breaks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Third attempt: &lt;code&gt;(float)&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The “solution” of using &lt;code&gt;trim()&lt;/code&gt; does not seem to be the best; a quick analysis of the code shows that PHP stores the temperatures in the array as strings and not as floats. Let's try casting to &lt;code&gt;float&lt;/code&gt; instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$stations&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The execution time is &lt;strong&gt;0.62&lt;/strong&gt; seconds, a big improvement over the previous attempt!&lt;/p&gt;

&lt;p&gt;Perhaps we can also increase the number of rows from 1 million to 10 million to better appreciate the timing variations.&lt;br&gt;
Let's try running the same script on 10 million rows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 20480 bytes) in src/calculate_3.php on line 13
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;PHP's current memory limit is 128MB; the script tries to build an array with every value in the &lt;code&gt;measurements.txt&lt;/code&gt; file, which exhausts those 128MB.&lt;br&gt;
Now we could raise that limit, or rewrite the code that takes care of aggregating and processing the data.&lt;br&gt;
Let's rewrite the code so that it no longer keeps the entire input file in memory.&lt;/p&gt;
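
&lt;p&gt;For completeness, the first option (raising the limit) could look like the minimal sketch below. The &lt;code&gt;512M&lt;/code&gt; value is an arbitrary assumption for illustration, not a figure measured for this dataset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php
// Raise the memory limit for this script only.
// 512M is an illustrative value, not a measured requirement.
ini_set('memory_limit', '512M');

// Print the limit that is now in effect.
echo ini_get('memory_limit'), PHP_EOL;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The same setting can also be passed on the command line for a single run, for example &lt;code&gt;php -d memory_limit=512M src/calculate_3.php&lt;/code&gt;.&lt;/p&gt;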
&lt;h3&gt;
  
  
  Fourth attempt: &lt;code&gt;while&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Let's try to read the data and process it in a single &lt;code&gt;while&lt;/code&gt; loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;fgets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$fp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;explode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;';'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$temp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="k"&gt;isset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This way we don't saturate the memory with the contents of the &lt;code&gt;measurements.txt&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;Execution time &lt;strong&gt;on 10 million rows&lt;/strong&gt;: &lt;strong&gt;6.14&lt;/strong&gt; seconds&lt;br&gt;
The earlier timings were taken on 1 million rows, so this number alone doesn't tell us whether performance improved or got worse; let's rerun on the 1 million row dataset.&lt;/p&gt;

&lt;p&gt;Execution time &lt;strong&gt;on 1 million rows&lt;/strong&gt;: &lt;strong&gt;0.76&lt;/strong&gt; seconds&lt;br&gt;
The time got worse, from 0.62 to 0.76 seconds (as expected); we traded lower RAM usage for a higher execution time.&lt;/p&gt;
&lt;h3&gt;
  
  
  Fifth attempt: &lt;code&gt;min()/max()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This time we try to use the native PHP functions &lt;code&gt;min()&lt;/code&gt; and &lt;code&gt;max()&lt;/code&gt; to save the minimum and maximum value of each station:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Execution time on 10 million rows: &lt;strong&gt;6.01&lt;/strong&gt; seconds&lt;/p&gt;

&lt;p&gt;It's a small improvement, but an improvement nonetheless.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sixth attempt: &lt;code&gt;if&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Perhaps we can simplify those two lines of code even further, using two simple &lt;code&gt;if&lt;/code&gt; statements:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$temp&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$temp&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Execution time on 10 million rows: &lt;strong&gt;4.90&lt;/strong&gt; seconds&lt;/p&gt;

&lt;p&gt;Here we are, another important step in the right direction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Seventh attempt: &lt;code&gt;!isset()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The situation inside the &lt;code&gt;while&lt;/code&gt; loop is this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="k"&gt;isset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We are used to thinking of that &lt;code&gt;!isset()&lt;/code&gt; as if it were a single operation; in reality, however, there are two: the negation (&lt;code&gt;!&lt;/code&gt;) and the &lt;code&gt;isset()&lt;/code&gt;.&lt;br&gt;
Let's try to invert the two branches of the &lt;code&gt;if&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;isset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$temp&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$temp&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$temp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Execution time on 10 million rows: &lt;strong&gt;4.90&lt;/strong&gt; seconds&lt;/p&gt;

&lt;p&gt;It's pretty much the same time as the previous attempt...&lt;/p&gt;

&lt;h3&gt;
  
  
  Eighth attempt: reference &lt;code&gt;&amp;amp;&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Another thing we can try inside the &lt;code&gt;if&lt;/code&gt; is to use a reference (PHP's closest equivalent to a pointer) to the array element that holds the weather station, instead of looking up &lt;code&gt;$results[$station][0]&lt;/code&gt; every time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$resultStation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nv"&gt;$results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$station&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
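
&lt;p&gt;As a rough sketch of how that reference might be used inside the loop: the in-line sample data and the exact placement of the &lt;code&gt;isset()&lt;/code&gt; check are assumptions made for illustration, since only the single line above is shown here:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php
// Hypothetical sample lines standing in for measurements.txt.
$lines = ["Rome;21.5\n", "Oslo;3.2\n", "Rome;18.0\n"];

$results = [];
foreach ($lines as $line) {
    [$station, $temp] = explode(';', $line);
    $temp = (float) $temp;

    // Taking the reference creates the element as null when it is missing,
    // so isset() on the reference still tells new stations apart.
    $resultStation = &amp;amp;$results[$station];
    if (isset($resultStation)) {
        if ($temp &amp;lt; $resultStation[0]) {
            $resultStation[0] = $temp;
        }
        if ($temp &amp;gt; $resultStation[1]) {
            $resultStation[1] = $temp;
        }
        $resultStation[2] += $temp;
        $resultStation[3]++;
    } else {
        // First reading for this station: [min, max, sum, count].
        $resultStation = [$temp, $temp, $temp, 1];
    }
}
unset($resultStation); // break the reference once the loop is done

print_r($results);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each iteration rebinds &lt;code&gt;$resultStation&lt;/code&gt; to the current station's element, so the station key is looked up once per line instead of once per access.&lt;/p&gt;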



&lt;p&gt;Execution time on 10 million rows: &lt;strong&gt;4.65&lt;/strong&gt; seconds&lt;/p&gt;

&lt;p&gt;The timing improves, so we keep this change.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ninth attempt: &lt;code&gt;fgets()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Let's focus for a moment on the &lt;code&gt;while&lt;/code&gt; that loops over the entire &lt;code&gt;measurements.txt&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;fgets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$fp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why is there a &lt;code&gt;4096&lt;/code&gt; as the second parameter of the &lt;code&gt;fgets()&lt;/code&gt; function?&lt;br&gt;
I saw it in the example code on the PHP documentation page, so I just mindlessly copied it into the script. However, I later checked the documentation and found that the second parameter caps how much of a line is read in a single call (reading stops after length - 1 bytes, at a newline, or at end of file).&lt;br&gt;
Some comments on the PHP documentation page suggest there may be a performance issue when passing this second parameter (which is optional by the way).&lt;br&gt;
Let's try removing it and see what happens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;fgets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$fp&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Execution time on 10 million rows: &lt;strong&gt;4.39&lt;/strong&gt; seconds&lt;/p&gt;

&lt;p&gt;...this confirms that reading the documentation carefully is always useful.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tenth attempt: &lt;code&gt;strtok()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The only point in the code that raises some doubts is the &lt;code&gt;explode()&lt;/code&gt; used to split the weather station's name from the temperature.&lt;br&gt;&lt;br&gt;
Let's try replacing it with &lt;code&gt;strtok()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$station&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;strtok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;';'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$temp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nb"&gt;strtok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;';'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
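
&lt;p&gt;To make the two calls easier to follow, here is a small sketch of what they return for a single line (the sample line is made up). The second token still carries the trailing newline, which the &lt;code&gt;(float)&lt;/code&gt; cast ignores:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&amp;lt;?php
// Hypothetical input line, in the same format as measurements.txt.
$line = "Hamburg;12.3\n";

// First call: pass the string and the delimiter.
$station = strtok($line, ';');

// Following calls: pass only the delimiter; strtok() continues on the same string.
$rest = strtok(';');

// Shows the station name, the raw second token and its float value.
var_dump($station, $rest, (float) $rest);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;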



&lt;p&gt;Execution time on 10 million rows: &lt;strong&gt;3.82&lt;/strong&gt; seconds&lt;/p&gt;

&lt;p&gt;With this latest version of the script, it seems we have run out of things to optimize.&lt;/p&gt;

&lt;h2&gt;
  
  
  Test on dedicated hardware
&lt;/h2&gt;

&lt;p&gt;Now that we have our most performant script, we can go ahead and run it on a dedicated machine.&lt;/p&gt;

&lt;h3&gt;
  
  
  Test on EC2 with 1 billion rows
&lt;/h3&gt;

&lt;p&gt;Let's try to run this script on hardware similar to the one used in the official Java challenge:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CPU: AMD EPYC 7502P 32 cores / 64 threads @ 2.5 GHz&lt;/li&gt;
&lt;li&gt;Memory: 128 GB ECC DDR4 RAM&lt;/li&gt;
&lt;li&gt;Disk: 2x SAMSUNG MZQL2960HCJR-00A07, 1 TB, Software RAID-1&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The EC2 instance that most closely resembles it (and is cheaper) is the m6a.8xlarge with its 32 vCPUs and 128GB of memory. For the hard disk I opted for a 200GB io1 volume (to reach 10,000 IOPS).&lt;/p&gt;

&lt;p&gt;I tried to launch the last script on the EC2 instance; this time I ran the script only 2 times.&lt;br&gt;
The result was this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; Performance counter stats for 'php src/calculate_10.php' (2 runs):

         261924.05 msec task-clock                       #    1.007 CPUs utilized            ( +-  0.72% )
               813      context-switches                 #    3.127 /sec                     ( +- 28.17% )
                 2      cpu-migrations                   #    0.008 /sec                   
              1486      page-faults                      #    5.715 /sec                   
   &amp;lt;not supported&amp;gt;      cycles                                                      
   &amp;lt;not supported&amp;gt;      instructions                                                
   &amp;lt;not supported&amp;gt;      branches                                                    
   &amp;lt;not supported&amp;gt;      branch-misses                                               
   &amp;lt;not supported&amp;gt;      L1-dcache-loads                                             
   &amp;lt;not supported&amp;gt;      L1-dcache-load-misses                                       
   &amp;lt;not supported&amp;gt;      LLC-loads                                                   
   &amp;lt;not supported&amp;gt;      LLC-load-misses                                             

            260.06 +- 1.89 seconds time elapsed  ( +-  0.73% )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;260.06 seconds, or 4 minutes and 20 seconds&lt;/strong&gt;... a truly disastrous result.&lt;/p&gt;

&lt;p&gt;In a previous attempt (the fourth) we had to change the way the data was aggregated, due to a PHP out-of-memory error; we had seen a significant performance degradation due to this change.&lt;br&gt;
Since the EC2 instance has a lot of RAM available, let's try to resume the script from the third attempt by applying the changes made from the fifth attempt onwards and raising the &lt;code&gt;memory_limit&lt;/code&gt; in the &lt;code&gt;php.ini&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;The results of this test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; Performance counter stats for 'php src/calculate_in_memory.php' (2 runs):

         259240.23 msec task-clock                       #    0.992 CPUs utilized            ( +-  0.81% )
              1348      context-switches                 #    5.158 /sec                     ( +- 14.73% )
                 5      cpu-migrations                   #    0.019 /sec                     ( +- 10.00% )
            120810      page-faults                      #  462.253 /sec                     ( +-  0.26% )
   &amp;lt;not supported&amp;gt;      cycles                                                      
   &amp;lt;not supported&amp;gt;      instructions                                                
   &amp;lt;not supported&amp;gt;      branches                                                    
   &amp;lt;not supported&amp;gt;      branch-misses                                               
   &amp;lt;not supported&amp;gt;      L1-dcache-loads                                             
   &amp;lt;not supported&amp;gt;      L1-dcache-load-misses                                       
   &amp;lt;not supported&amp;gt;      LLC-loads                                                   
   &amp;lt;not supported&amp;gt;      LLC-load-misses                                             

            261.38 +- 2.10 seconds time elapsed  ( +-  0.80% )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;261.38&lt;/strong&gt; seconds, one second slower than the previous version.&lt;br&gt;
Probably the difference is irrelevant on such high-performance hardware.&lt;/p&gt;
&lt;h3&gt;
  
  
  Multi-thread PHP
&lt;/h3&gt;

&lt;p&gt;The only thing left to try is to write a PHP script that takes advantage of the interpreter's multithreading capabilities.&lt;br&gt;
This task turned out to be quite complex for two reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The PHP interpreter must be compiled with the ZTS (Zend Thread Safe) option to launch parallel executions; few Linux distributions provide an interpreter with this feature turned on;&lt;/li&gt;
&lt;li&gt;The PHP script needs to be completely rewritten to take advantage of thread concurrency;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For the first point, the only possibility is to compile PHP from sources with the ZTS option active, and then install the PECL &lt;code&gt;parallel&lt;/code&gt; module.&lt;br&gt;
On Debian, this is possible using the commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install build tools and libraries needed&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nb"&gt;install &lt;/span&gt;build-essential autoconf libtool bison re2c pkg-config git libxml2-dev libssl-dev

&lt;span class="c"&gt;# Clone and build a stripped-down version of PHP with ZTS support&lt;/span&gt;
git clone https://github.com/php/php-src.git &lt;span class="nt"&gt;--branch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;PHP-8.4.3 &lt;span class="nt"&gt;--depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
&lt;span class="nb"&gt;cd &lt;/span&gt;php-src/
./buildconf
./configure &lt;span class="nt"&gt;--prefix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/opt/php8.4-zts &lt;span class="nt"&gt;--with-config-file-path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/opt/php8.4-zts/etc/php &lt;span class="nt"&gt;--disable-all&lt;/span&gt; &lt;span class="nt"&gt;--disable-ipv6&lt;/span&gt; &lt;span class="nt"&gt;--disable-cgi&lt;/span&gt; &lt;span class="nt"&gt;--disable-phpdbg&lt;/span&gt; &lt;span class="nt"&gt;--enable-zts&lt;/span&gt; &lt;span class="nt"&gt;--enable-xml&lt;/span&gt; &lt;span class="nt"&gt;--with-libxml&lt;/span&gt; &lt;span class="nt"&gt;--with-pear&lt;/span&gt; &lt;span class="nt"&gt;--with-openssl&lt;/span&gt;
make &lt;span class="nt"&gt;-j32&lt;/span&gt;
./sapi/cli/php &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;make &lt;span class="nb"&gt;install&lt;/span&gt;

&lt;span class="c"&gt;# Install `parallel` module from PECL&lt;/span&gt;
&lt;span class="nb"&gt;sudo&lt;/span&gt; /opt/php8.4-zts/bin/pecl channel-update pecl.php.net
&lt;span class="nb"&gt;sudo&lt;/span&gt; /opt/php8.4-zts/bin/pecl &lt;span class="nb"&gt;install &lt;/span&gt;parallel
&lt;span class="nb"&gt;sudo mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /opt/php8.4-zts/etc/php/conf.d
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'extension=parallel.so'&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; /opt/php8.4-zts/etc/php/php.ini
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'memory_limit=-1'&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; /opt/php8.4-zts/etc/php/php.ini

&lt;span class="c"&gt;# Verify module installation&lt;/span&gt;
/opt/php8.4-zts/bin/php &lt;span class="nt"&gt;-i&lt;/span&gt; | &lt;span class="nb"&gt;grep &lt;/span&gt;parallel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, we downloaded and compiled the &lt;code&gt;8.4.3&lt;/code&gt; version of PHP.&lt;br&gt;&lt;br&gt;
The ZTS version of PHP can be run with the command &lt;code&gt;/opt/php8.4-zts/bin/php&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For the second point (rewriting the PHP script for multithreading), you can take inspiration from the PHP documentation and solutions to the 1BRC challenge available on the Internet.&lt;br&gt;
The one I took heavily from is &lt;a href="https://github.com/realFlowControl/1brc/blob/main/calculateAverage.php" rel="noopener noreferrer"&gt;this one&lt;/a&gt;.&lt;br&gt;
The overhead from multithread management mostly comes from having to go through the &lt;code&gt;measurements.txt&lt;/code&gt; file first to split it into chunks, one for each core of the machine where the script is running. Each thread processes one of these chunks, and the partial results are then combined into the final result.&lt;/p&gt;
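&lt;p&gt;To make the chunking step more concrete, here is a minimal Python sketch of the idea (not the PHP code from the repository): it computes byte ranges that always end right after a newline, so no row is split between two workers. The function name &lt;code&gt;chunk_boundaries&lt;/code&gt; and the tiny sample file are hypothetical, chosen only for illustration.&lt;/p&gt;

```python
import os
import tempfile


def chunk_boundaries(path, n_chunks):
    """Return (start, end) byte ranges; every range ends right after a newline."""
    size = os.path.getsize(path)
    target = max(1, size // n_chunks)  # approximate chunk size in bytes
    boundaries = []
    start = 0
    with open(path, "rb") as f:
        while start != size:
            # Jump roughly one chunk ahead, then finish the line we landed in.
            f.seek(min(start + target, size))
            f.readline()
            end = min(f.tell(), size)
            boundaries.append((start, end))
            start = end
    return boundaries


# Demo on a tiny hypothetical file in the "station;temperature" row format.
rows = [b"Rome;21.5\n", b"Oslo;-3.2\n", b"Cairo;30.1\n", b"Lima;18.0\n", b"Kyiv;4.4\n"]
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"".join(rows))
    demo_path = tmp.name

chunks = chunk_boundaries(demo_path, 3)
with open(demo_path, "rb") as f:
    data = f.read()
os.remove(demo_path)

# Each piece is what a single worker would parse on its own.
pieces = [data[start:end] for start, end in chunks]
print(chunks)
```

&lt;p&gt;Only the bytes around each boundary are read here, so in a sketch like this the splitting cost stays small compared to parsing the rows themselves.&lt;/p&gt;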

&lt;p&gt;The source code is fully available in the GitHub repository.&lt;/p&gt;

&lt;p&gt;The results on the EC2 instance are:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; Performance counter stats for '/opt/php8.4-zts/bin/php src/zts/zts_calculate_1.php 32' (10 runs):

         521063.71 msec task-clock                       #   30.537 CPUs utilized            ( +-  0.17% )
              4416      context-switches                 #    8.521 /sec                     ( +- 12.71% )
                25      cpu-migrations                   #    0.048 /sec                     ( +- 27.73% )
             41060      page-faults                      #   79.228 /sec                     ( +-  0.21% )
   &amp;lt;not supported&amp;gt;      cycles                                                      
   &amp;lt;not supported&amp;gt;      instructions                                                
   &amp;lt;not supported&amp;gt;      branches                                                    
   &amp;lt;not supported&amp;gt;      branch-misses                                               
   &amp;lt;not supported&amp;gt;      L1-dcache-loads                                             
   &amp;lt;not supported&amp;gt;      L1-dcache-load-misses                                       
   &amp;lt;not supported&amp;gt;      LLC-loads                                                   
   &amp;lt;not supported&amp;gt;      LLC-load-misses                                             

           17.0636 +- 0.0181 seconds time elapsed  ( +-  0.11% )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;17.0636&lt;/strong&gt; seconds... an impressive result considering the previous 260 seconds! &lt;/p&gt;
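&lt;p&gt;As a rough sanity check, the numbers from the two &lt;code&gt;perf&lt;/code&gt; outputs can be turned into a speedup and a parallel efficiency. This is only back-of-the-envelope arithmetic: the two runs use different scripts and different PHP builds, so the comparison is approximate. The variable names are made up for this sketch.&lt;/p&gt;

```python
# Figures copied from the perf outputs in this article.
single_thread_s = 260.06   # calculate_10.php, seconds elapsed
multi_thread_s = 17.0636   # zts_calculate_1.php with 32 threads, seconds elapsed
threads = 32

speedup = single_thread_s / multi_thread_s
efficiency = speedup / threads  # 1.0 would mean perfect scaling

single_cpu_s = 261924.05 / 1000  # task-clock of the single-threaded run
multi_cpu_s = 521063.71 / 1000   # task-clock summed over all threads
cpu_time_ratio = multi_cpu_s / single_cpu_s

print(f"speedup: {speedup:.2f}x")
print(f"parallel efficiency: {efficiency:.0%}")
print(f"total CPU time ratio: {cpu_time_ratio:.2f}x")
```

&lt;p&gt;In other words, the multithreaded run finishes about 15 times sooner while burning roughly twice the total CPU time, which is why the speedup is well below the ideal 32x.&lt;/p&gt;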

&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;For now, we'll pause here; we're left with a time of 17.06 seconds for the execution on 1 billion rows.&lt;br&gt;&lt;br&gt;
In the next article, we'll see another way to face this challenge with PHP.&lt;br&gt;
I'll leave you with the summary of the test results on the EC2 instance. The first 9 scripts were run on the 1 million and 10 million row datasets. Script 10, the one with all the data in RAM (&lt;code&gt;calculate_in_memory.php&lt;/code&gt;), and the ZTS script were run on the 1 billion row dataset.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgatjo5xgf6953orzywu3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgatjo5xgf6953orzywu3.png" alt="Image description" width="784" height="337"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;NOTE:&lt;/em&gt;&lt;/strong&gt; The 1M and 10M lines tests were run with PHP &lt;code&gt;8.4.2&lt;/code&gt;; the 1B lines tests were run with PHP &lt;code&gt;8.4.3&lt;/code&gt;.  &lt;/p&gt;

&lt;p&gt;The full code is available on GitHub:&lt;br&gt;
&lt;a href="https://github.com/gfabrizi/1brc-php" rel="noopener noreferrer"&gt;https://github.com/gfabrizi/1brc-php&lt;/a&gt;&lt;/p&gt;

</description>
      <category>php</category>
      <category>1brc</category>
      <category>performance</category>
    </item>
    <item>
      <title>Getting started with Cypher and RedisGraph / Part III</title>
      <dc:creator>Gianluca Fabrizi</dc:creator>
      <pubDate>Mon, 29 May 2023 08:16:00 +0000</pubDate>
      <link>https://dev.to/gfabrizi/getting-started-with-cypher-and-redisgraph-part-iii-30da</link>
      <guid>https://dev.to/gfabrizi/getting-started-with-cypher-and-redisgraph-part-iii-30da</guid>
      <description>&lt;p&gt;Welcome to the last part of the "Getting started with Cypher and RedisGraph" series.&lt;br&gt;
In the first post we looked at what a Graph DB is and why you should use one; the second post explained the basics of querying RedisGraph.&lt;/p&gt;

&lt;p&gt;Now we will build a very small example to show everything we learned in the previous posts.&lt;/p&gt;

&lt;p&gt;First of all, start RedisInsight with &lt;code&gt;docker compose up -d&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;At this point you need python3 installed on your local machine; if you don't have it, look up the installation instructions for your OS.&lt;br&gt;
You also need the click and redis Python packages; you can install them with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip3 &lt;span class="nb"&gt;install &lt;/span&gt;click redis
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that you have everything, you can clone the demo repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/gfabrizi/redisgraph-demo.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enter the cloned directory and launch the data import:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 bulk_insert.py Routes &lt;span class="nt"&gt;-n&lt;/span&gt; City.csv &lt;span class="nt"&gt;-r&lt;/span&gt; ROUTE.csv &lt;span class="nt"&gt;-h&lt;/span&gt; 127.0.0.1 &lt;span class="nt"&gt;-p&lt;/span&gt; 6379
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see a message like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;b&lt;span class="s1"&gt;'City'&lt;/span&gt;  &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="c"&gt;####################################]  100%&lt;/span&gt;
19 nodes created with label &lt;span class="s1"&gt;'b'&lt;/span&gt;City&lt;span class="s1"&gt;''&lt;/span&gt;
b&lt;span class="s1"&gt;'ROUTE'&lt;/span&gt;  &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="c"&gt;####################################]  100%&lt;/span&gt;
81 relations created &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="s1"&gt;'b'&lt;/span&gt;ROUTE&lt;span class="s1"&gt;''&lt;/span&gt;
Construction of graph &lt;span class="s1"&gt;'Routes'&lt;/span&gt; &lt;span class="nb"&gt;complete&lt;/span&gt;: 19 nodes created, 81 relations created &lt;span class="k"&gt;in &lt;/span&gt;0.006071 seconds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What just happened?
&lt;/h2&gt;

&lt;p&gt;We used a script (&lt;code&gt;bulk_insert.py&lt;/code&gt;) from the awesome &lt;code&gt;redis-datasets&lt;/code&gt; repository:&lt;br&gt;
&lt;a href="https://github.com/redis-developer/redis-datasets" rel="noopener noreferrer"&gt;https://github.com/redis-developer/redis-datasets&lt;/a&gt;&lt;br&gt;
The script imports nodes (&lt;code&gt;-n&lt;/code&gt; parameter) and relationships (&lt;code&gt;-r&lt;/code&gt; parameter). You can specify more than one node file and more than one relationship file.&lt;br&gt;
We imported nodes representing cities (&lt;code&gt;City.csv&lt;/code&gt;) and relations (&lt;code&gt;ROUTE.csv&lt;/code&gt;); the import script uses the CSV filenames to name the node labels and the relation types.&lt;br&gt;
So now we have our graph "Routes" filled with 19 cities and 81 routes, as reported by the import output.&lt;/p&gt;
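&lt;p&gt;The filename-to-label convention can be pictured with a couple of lines of Python. This is an assumption about the observable behavior (the import output shows the label &lt;code&gt;City&lt;/code&gt; and the type &lt;code&gt;ROUTE&lt;/code&gt;), not the actual code of &lt;code&gt;bulk_insert.py&lt;/code&gt;; the helper name &lt;code&gt;label_from_filename&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
from pathlib import Path


def label_from_filename(csv_path):
    """Hypothetical helper: use the filename without its extension as the label."""
    return Path(csv_path).stem


labels = [label_from_filename(name) for name in ["City.csv", "ROUTE.csv"]]
print(labels)
```

&lt;p&gt;This is why the CSV files are named after the node label and the relation type you want to end up with in the graph.&lt;/p&gt;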

&lt;p&gt;It's time to open RedisInsight UI in the browser and start querying!&lt;/p&gt;
&lt;h2&gt;
  
  
  Visualize data
&lt;/h2&gt;

&lt;p&gt;Go to the "Workbench" tab in RedisInsight and launch the generic query to show all nodes and relations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Routes&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (a) RETURN a"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see a graph like this (remember to check the "All relationships" flag):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkp6ya9djotatzyh9d0o9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkp6ya9djotatzyh9d0o9.png" alt="Showing all nodes and edges" width="800" height="625"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Not really helpful, right? 😔&lt;br&gt;
Well, the information is there; now you have to query for what you want to know.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;A little background:&lt;/strong&gt;&lt;br&gt;
Years ago a company asked me to solve a test for a backend developer interview. I had 3-4 days to solve this: "Imagine you have 20 goods depots, connected through cargo truck routes (each depot is connected with a maximum of 3 nearby depots). Write a script that - given depot A and depot B - finds the best route between them".&lt;br&gt;
I struggled for days with this test, trying to learn and implement Dijkstra's algorithm... with no success at all! 🥲&lt;br&gt;
Well, if I had known about GraphDBs at that time, I would have passed that test! 🔝&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For example, if you want to know the shortest routes between Memphis and Phoenix, you could use this query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Routes&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (source:City {name: 'Memphis'}), (destination:City {name: 'Phoenix'}) with source, destination MATCH p=allShortestPaths((source)-[:ROUTE*]-&amp;gt;(destination)) RETURN nodes(p) as cities"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We wrote a first &lt;code&gt;MATCH&lt;/code&gt; clause, followed by a second one; to use the nodes matched by the first &lt;code&gt;MATCH&lt;/code&gt; in the second one, we need to pass them along with the &lt;code&gt;with&lt;/code&gt; keyword. This keyword is useful whenever you need one (or more) previously matched nodes in a following &lt;code&gt;MATCH&lt;/code&gt;.&lt;br&gt;
&lt;code&gt;allShortestPaths&lt;/code&gt; is a function of &lt;code&gt;MATCH&lt;/code&gt; that finds all the shortest paths between two nodes; if we choose to view the results as text (instead of the default graph), we can see that Redis found 2 paths:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fclthg8up6dx3lyl25cll.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fclthg8up6dx3lyl25cll.png" alt="allShortestPaths match" width="800" height="175"&gt;&lt;/a&gt;&lt;/p&gt;
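&lt;p&gt;To get a feeling for what "all shortest paths" means, here is a small Python sketch that finds every minimum-hop path with a breadth-first search. It only illustrates the concept and is not how RedisGraph implements &lt;code&gt;allShortestPaths&lt;/code&gt;; the tiny graph with nodes &lt;code&gt;A&lt;/code&gt; to &lt;code&gt;D&lt;/code&gt; is hypothetical and unrelated to the cities dataset.&lt;/p&gt;

```python
from collections import deque


def all_shortest_paths(graph, source, destination):
    """Return every minimum-hop path from source to destination."""
    dist = {source: 0}
    parents = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in dist:
                # First time we reach nxt: this is its shortest distance.
                dist[nxt] = dist[node] + 1
                parents[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                # Another equally short way to reach nxt.
                parents[nxt].append(node)
    if destination not in dist:
        return []

    def walk(node):
        # Rebuild the paths backwards by following the recorded parents.
        if node == source:
            return [[source]]
        return [path + [node] for parent in parents[node] for path in walk(parent)]

    return walk(destination)


# Hypothetical directed graph: two equally short routes from A to D.
routes = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
paths = all_shortest_paths(routes, "A", "D")
print(paths)
```

&lt;p&gt;With a graph database the whole search collapses into the single Cypher pattern shown above, which is the point of the little background story.&lt;/p&gt;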

&lt;p&gt;If we wanted only one shortest path, we could use the &lt;code&gt;shortestPath&lt;/code&gt; function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Routes&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (source:City {name: 'Memphis'}), (destination:City {name: 'Phoenix'}) RETURN shortestPath((source)-[:ROUTE*]-&amp;gt;(destination))"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you look at the textual output of the results, you'll notice that this time RedisGraph returned not only the nodes, but also the relationships between nodes.&lt;br&gt;
If you only want the nodes involved in the &lt;code&gt;shortestPath&lt;/code&gt;, change the query like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Routes&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (source:City {name: 'Memphis'}), (destination:City {name: 'Phoenix'}) RETURN nodes(shortestPath((source)-[:ROUTE*]-&amp;gt;(destination)))"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cool, right?&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This series showed the basics of RedisGraph and the Cypher language.&lt;br&gt;
There is much more to explore and learn; if we look just at the "path algorithms", the official documentation offers many functions to use:&lt;br&gt;
&lt;a href="https://redis.io/docs/stack/graph/path_algorithm/#find-all-paths-from-a-if-the-trip-is-limited-to-10-kilometers" rel="noopener noreferrer"&gt;https://redis.io/docs/stack/graph/path_algorithm/#find-all-paths-from-a-if-the-trip-is-limited-to-10-kilometers&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I hope you have followed this series and enjoyed it as much as I did!😊🎉&lt;/p&gt;

</description>
      <category>cypher</category>
      <category>redisgraph</category>
      <category>graph</category>
      <category>redis</category>
    </item>
    <item>
      <title>Getting started with Cypher and RedisGraph / Part II</title>
      <dc:creator>Gianluca Fabrizi</dc:creator>
      <pubDate>Fri, 19 May 2023 08:58:00 +0000</pubDate>
      <link>https://dev.to/gfabrizi/getting-started-with-cypher-and-redisgraph-part-ii-83f</link>
      <guid>https://dev.to/gfabrizi/getting-started-with-cypher-and-redisgraph-part-ii-83f</guid>
      <description>&lt;p&gt;In the first post of this series we looked at what a Graph DB is and why you should use one.&lt;br&gt;
Today we'll learn what Cypher is and start querying the graph!&lt;/p&gt;
&lt;h2&gt;
  
  
  What is Cypher
&lt;/h2&gt;

&lt;p&gt;Cypher is a declarative query language, developed internally by Neo4j since 2011.&lt;br&gt;
There is an ongoing standardization process (OpenCypher) to make Cypher an open standard.&lt;br&gt;
Cypher lets you easily retrieve data from the graph. It's one of the simplest query languages to learn; its syntax is constructed in a way that helps visualize relationships between nodes.&lt;/p&gt;
&lt;h2&gt;
  
  
  RedisInsight
&lt;/h2&gt;

&lt;p&gt;We will be running queries in RedisInsight. If you've read the previous post of this series, you should know how to run a Docker-based installation of Redis.&lt;br&gt;
After launching &lt;code&gt;docker compose up -d&lt;/code&gt;, point your browser to &lt;a href="http://localhost:8001" rel="noopener noreferrer"&gt;http://localhost:8001&lt;/a&gt;; RedisInsight should welcome you. Accept the EULA and you will be redirected to the RedisInsight dashboard.&lt;br&gt;
Next click on the "Workbench" tab (the third icon in the side menu on the left).&lt;br&gt;
The upper right pane of this new screen is where you'll write all the queries.&lt;/p&gt;
&lt;h2&gt;
  
  
  Adding nodes and edges
&lt;/h2&gt;

&lt;p&gt;Now that both Redis and RedisInsight are up &amp;amp; running, let's create a new graph and add one node to it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"CREATE (:Person {name: 'Laura Phillips', age: 32})"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every query is prefixed with &lt;code&gt;GRAPH.QUERY&lt;/code&gt; and the graph name.&lt;br&gt;
Now click the green arrow to the right or press &lt;code&gt;Ctrl + Enter&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The output should be similar to this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9409d6edagd1mdbu0v6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9409d6edagd1mdbu0v6.png" alt="after first CREATE" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The first time we create a node in a non-existent graph, RedisGraph will create the graph for us.&lt;br&gt;
So we've just created a new graph called "Social" and added one node of type "Person".&lt;br&gt;
&lt;code&gt;name&lt;/code&gt; and &lt;code&gt;age&lt;/code&gt; are properties that we add to the node. RedisGraph automatically adds another property to each node: the unique &lt;code&gt;id&lt;/code&gt;.&lt;/p&gt;
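&lt;p&gt;Outside RedisInsight, the same &lt;code&gt;GRAPH.QUERY&lt;/code&gt; command can be sent by any Redis client as a plain command with two arguments: the graph name and the query string. The sketch below shows that shape in Python; &lt;code&gt;run_graph_query&lt;/code&gt; and &lt;code&gt;RecordingClient&lt;/code&gt; are hypothetical names, and the stand-in client only records the command so the example runs without a server.&lt;/p&gt;

```python
def run_graph_query(client, graph, query):
    """Send one Cypher query to a graph through any client with execute_command."""
    return client.execute_command("GRAPH.QUERY", graph, query)


class RecordingClient:
    """Hypothetical stand-in for a real client: records commands instead of sending them."""

    def __init__(self):
        self.sent = []

    def execute_command(self, *args):
        self.sent.append(args)
        return "recorded, not sent"


client = RecordingClient()
run_graph_query(client, "Social", "MATCH (p:Person) RETURN p")
print(client.sent[0])
```

&lt;p&gt;With the &lt;code&gt;redis&lt;/code&gt; Python package installed, a real client such as &lt;code&gt;redis.Redis(host="127.0.0.1", port=6379)&lt;/code&gt; could be passed in place of the stand-in, assuming Redis is listening on that address.&lt;/p&gt;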

&lt;p&gt;We can add more nodes in bulk:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"CREATE (:Person {name: 'Diana Hendrickson', age: 31}), (:Person {name: 'Susan Hendrickson', age: 29}), (:Person {name: 'Peter Steinmetz', age: 30}), (:Person {name: 'Louise Rosol', age: 29}), (:Person {name: 'Bryce Fett', age: 30})"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you should have 6 nodes of type &lt;code&gt;Person&lt;/code&gt; in the graph. Let's check them out!&lt;/p&gt;

&lt;h2&gt;
  
  
  Basic querying
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (p:Person) RETURN p"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's analyze this query: &lt;code&gt;MATCH&lt;/code&gt; is the keyword that describes the relationship between queried entities. It's used to 'match' something in the graph, based on some parameters. In this case we are asking for &lt;code&gt;Person&lt;/code&gt;, and we alias our results with a &lt;code&gt;p&lt;/code&gt; (the alias name is not important, you can choose any letter or word).&lt;br&gt;
Nodes are always specified in round brackets.&lt;br&gt;
The &lt;code&gt;RETURN&lt;/code&gt; keyword returns every &lt;code&gt;p&lt;/code&gt; found.&lt;br&gt;
So this query returns every &lt;code&gt;Person&lt;/code&gt; it finds in the graph &lt;code&gt;Social&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The resulting graph should be something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdgbsxxt0l0hystr093uo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdgbsxxt0l0hystr093uo.png" alt="MATCH-ing all nodes" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The previous query could also be written as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (a) RETURN a"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the generic way to view all nodes in the graph; you'll see this a &lt;strong&gt;lot of times&lt;/strong&gt;. Here we are using an alias &lt;code&gt;a&lt;/code&gt; without asking for a specific node type; since every node in our graph is a &lt;code&gt;Person&lt;/code&gt;, the result is the same.&lt;/p&gt;

&lt;p&gt;Ok, so let's say we want to retrieve all Person(s) who are exactly 30 years old:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (p:Person {age: 30}) RETURN p"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query introduces another feature: matching on a node's attributes. Here we are matching only nodes of type &lt;code&gt;Person&lt;/code&gt; with an attribute &lt;code&gt;age&lt;/code&gt; equal to &lt;code&gt;30&lt;/code&gt;.&lt;br&gt;
You can query for multiple attributes by separating the &lt;code&gt;attribute: value&lt;/code&gt; pairs with commas between the curly brackets.&lt;/p&gt;

&lt;p&gt;Sometimes visualizing a graph of this kind isn't very helpful, and a textual result might be more useful.&lt;br&gt;
So click on the &lt;code&gt;&amp;lt;/&amp;gt;&lt;/code&gt; icon just above the last graph and choose &lt;code&gt;Text&lt;/code&gt;; the output of the last query will be shown as text:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5h3h3bshnyldq5bviy5n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5h3h3bshnyldq5bviy5n.png" alt="MATCH results as text" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Connecting nodes and creating edges
&lt;/h2&gt;

&lt;p&gt;Now we add one relation between two nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (a:Person {name: 'Diana Hendrickson'}), (b:Person {name: 'Susan Hendrickson'}) CREATE (a)-[:KNOWS {relation: 'sister'}]-&amp;gt;(b)"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first part of the query is a MATCH to find 2 &lt;code&gt;Person&lt;/code&gt; ('Diana Hendrickson' and 'Susan Hendrickson'); we alias them with &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;b&lt;/code&gt;.&lt;br&gt;&lt;br&gt;
Then we create the relation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:KNOWS&lt;/span&gt; &lt;span class="ss"&gt;{&lt;/span&gt;&lt;span class="py"&gt;relation:&lt;/span&gt; &lt;span class="s1"&gt;'sister'&lt;/span&gt;&lt;span class="ss"&gt;}]&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Relationships are always written in square brackets. We are binding node &lt;code&gt;a&lt;/code&gt; and node &lt;code&gt;b&lt;/code&gt; with a relation of type &lt;code&gt;KNOWS&lt;/code&gt;. The &lt;code&gt;KNOWS&lt;/code&gt; relation has an attribute &lt;code&gt;relation&lt;/code&gt; with value &lt;code&gt;sister&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F97vka7xht7waqd60c8cc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F97vka7xht7waqd60c8cc.png" alt="Edge creation" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Redis is informing us that it has created one relationship with one property.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; The correct way to add a relation between existing nodes is by matching them first; if we try to directly create the relationship, like so:&lt;br&gt;
&lt;code&gt;CREATE (a:Person {name: 'Diana Hendrickson'})-[:KNOWS {relation: 'sister'}]-&amp;gt;(b:Person {name: 'Susan Hendrickson'})&lt;/code&gt;&lt;br&gt;
RedisGraph would create 2 new nodes (duplicating the ones we already have) and add the relation between them.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The generic pattern for a relationship has this form:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NodeA&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="ss"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;:Relationship&lt;/span&gt;&lt;span class="ss"&gt;]&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NodeB&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Take one minute to appreciate how eloquent and visually clear the syntax of Cypher can be 😯👏.&lt;/p&gt;

&lt;p&gt;Ok, now we can add some more relationships to the graph:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (a:Person {name: 'Susan Hendrickson'}), (b:Person {name: 'Peter Steinmetz'}) CREATE (a)-[:KNOWS {relation: 'married'}]-&amp;gt;(b)"&lt;/span&gt;
&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (a:Person {name: 'Susan Hendrickson'}), (b:Person {name: 'Louise Rosol'}) CREATE (a)-[:KNOWS {relation: 'friend'}]-&amp;gt;(b)"&lt;/span&gt;
&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (a:Person {name: 'Peter Steinmetz'}), (b:Person {name: 'Bryce Fett'}) CREATE (a)-[:KNOWS {relation: 'friend'}]-&amp;gt;(b)"&lt;/span&gt;
&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (a:Person {name: 'Laura Phillips'}), (b:Person {name: 'Louise Rosol'}) CREATE (a)-[:KNOWS {relation: 'coworker'}]-&amp;gt;(b)"&lt;/span&gt;
&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (a:Person {name: 'Laura Phillips'}), (b:Person {name: 'Diana Hendrickson'}) CREATE (a)-[:KNOWS {relation: 'coworker'}]-&amp;gt;(b)"&lt;/span&gt;
&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (a:Person {name: 'Louise Rosol'}), (b:Person {name: 'Diana Hendrickson'}) CREATE (a)-[:KNOWS {relation: 'coworker'}]-&amp;gt;(b)"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's see what we just did:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (p:Person) RETURN p"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;the graph is &lt;strong&gt;exactly&lt;/strong&gt; the same as before... why?🤔&lt;/p&gt;

&lt;p&gt;The reason is that RedisInsight hides relationships by default; if you enable the "All relationship" toggle in the upper right corner of the graph, RedisInsight will also show you all the relationships between nodes:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy71johjkstfomdi5jcds.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy71johjkstfomdi5jcds.png" alt="All edges created" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now try to create a connection between an existing node and a non-existent one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (p:Person {name: 'Laura Phillips'}) CREATE (p)-[:KNOWS {relation: 'married'}]-&amp;gt;(:Person {name: 'William Stultz', age:33})"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;RedisGraph creates the missing node for us (&lt;code&gt;Person&lt;/code&gt; 'William Stultz') and adds the relationship between the existing node and the new one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Querying for relationships
&lt;/h2&gt;

&lt;p&gt;Now that we've created some "connections" between nodes, we can query the graph for relationships between nodes.&lt;br&gt;
Let's say we want to show every married &lt;code&gt;Person&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (p:Person)-[:KNOWS {relation:'married'}]-&amp;gt;(o:Person) RETURN p,o"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or we need the list of every &lt;code&gt;Person&lt;/code&gt; that knows 'Susan Hendrickson':&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (p:Person)-[:KNOWS]-(:Person {name: 'Susan Hendrickson'}) RETURN p"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;here we made some subtle but important changes to the query: first of all, we don't need to return the &lt;code&gt;Person&lt;/code&gt; 'Susan Hendrickson', so we haven't given it an alias.&lt;br&gt;
Second: in RedisGraph every relation has a direction (starting from node A and ending at node B). There's no point in giving the &lt;code&gt;:KNOWS&lt;/code&gt; relation a direction in the last query (two friends are friends with each other; it's a bi-directional relationship).&lt;br&gt;
So we query for the relation without specifying the direction; the syntax &lt;code&gt;(a)-[:KNOWS]-&amp;gt;(b)&lt;/code&gt; becomes &lt;code&gt;(a)-[:KNOWS]-(b)&lt;/code&gt;.&lt;/p&gt;
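
&lt;p&gt;To make the difference concrete, here is a sketch of the directed counterpart of the query above. It should return only the people at the receiving end of a &lt;code&gt;KNOWS&lt;/code&gt; relationship that starts from 'Susan Hendrickson', whereas the undirected version also returns the people whose relationship points to her:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (:Person {name: 'Susan Hendrickson'})-[:KNOWS]-&amp;gt;(p:Person) RETURN p"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;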

&lt;p&gt;What if instead of the list of acquaintances of 'Susan Hendrickson' we want just the count?&lt;br&gt;
We can use the &lt;code&gt;COUNT&lt;/code&gt; aggregation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (p:Person)-[:KNOWS]-(:Person {name: 'Susan Hendrickson'}) RETURN count(p)"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;There is no graph to see here, so RedisInsight only shows the textual output of the &lt;code&gt;COUNT&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmaq1hqg8up4p3i3ey805.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmaq1hqg8up4p3i3ey805.png" alt="COUNT nodes" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Delete nodes and edges
&lt;/h2&gt;

&lt;p&gt;The last query we'll look at today is the &lt;code&gt;DELETE&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (p:Person {name: 'William Stultz'}) DELETE p"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;again, we first &lt;code&gt;MATCH&lt;/code&gt; the node, then we &lt;code&gt;DELETE&lt;/code&gt; it.&lt;br&gt;&lt;br&gt;
As you may remember, we created a relationship between this node and another &lt;code&gt;Person&lt;/code&gt; ('Laura Phillips'). When we delete a node, all its relationships are deleted too. This may seem obvious, but it's something to keep in mind.&lt;/p&gt;
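
&lt;p&gt;The same "first &lt;code&gt;MATCH&lt;/code&gt;, then &lt;code&gt;DELETE&lt;/code&gt;" approach should also work for a single edge. As a sketch, assuming we want to drop only the 'coworker' relationship created earlier between 'Laura Phillips' and 'Louise Rosol', we can give the relationship an alias (&lt;code&gt;r&lt;/code&gt; is just an illustrative name) and delete that instead of a node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;&lt;span class="n"&gt;GRAPH.QUERY&lt;/span&gt; &lt;span class="n"&gt;Social&lt;/span&gt; &lt;span class="s2"&gt;"MATCH (:Person {name: 'Laura Phillips'})-[r:KNOWS {relation: 'coworker'}]-&amp;gt;(:Person {name: 'Louise Rosol'}) DELETE r"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In this case both nodes stay in the graph; only the edge between them is removed.&lt;/p&gt;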

&lt;h2&gt;
  
  
  RECAP
&lt;/h2&gt;

&lt;p&gt;In this post we learned the basics of Cypher and how to query RedisGraph. The topic is vast; we have covered only a small part of it, just to get you started.&lt;/p&gt;

&lt;p&gt;In the next (and last) post, I'll show you a more practical example of RedisGraph in action.&lt;/p&gt;

</description>
      <category>cypher</category>
      <category>redisgraph</category>
      <category>graph</category>
      <category>redis</category>
    </item>
    <item>
      <title>Getting started with Cypher and RedisGraph / Part I</title>
      <dc:creator>Gianluca Fabrizi</dc:creator>
      <pubDate>Mon, 15 May 2023 08:03:00 +0000</pubDate>
      <link>https://dev.to/gfabrizi/getting-started-with-cypher-and-redisgraph-5479</link>
      <guid>https://dev.to/gfabrizi/getting-started-with-cypher-and-redisgraph-5479</guid>
      <description>&lt;p&gt;I keep a personal to-do list where i annotate technologies and frameworks i'd like to learn. Aside from the perpetual "learn Rust", there's been a "learn graph db" for months now.&lt;br&gt;
Now it's time to start exploring it.&lt;br&gt;
What better way to understand how much I've learned than writing a serie of posts about it?&lt;/p&gt;

&lt;p&gt;In this first post I'll try to explain what a Graph DB is, why I chose RedisGraph and how to use it locally. In the next post we will see how to query the db and how to add and edit relationships. In the third (and last) post we'll be looking at some examples.&lt;/p&gt;

&lt;p&gt;Without further ado, let's briefly see what a Graph database is.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61ttcbso71l4510mm062.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61ttcbso71l4510mm062.jpg" alt="Graph DB" width="800" height="573"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  What is a Graph database?
&lt;/h2&gt;

&lt;p&gt;In "classic" relational and NoSQL databases, data are structured in a tabular way (you may think of it as a spreadsheet); this makes querying from relations between data harder (usually you have to join two or more tables).&lt;br&gt;
Graph databases are structured following a graph data model.&lt;br&gt;
In "graph data model" relationships between nodes are the priority and the way you lay out your data is more expressive.&lt;br&gt;
Usually Graph databases come with a visualization tool that lets you see how your data is connected, thus helping you getting useful insight from your data.&lt;br&gt;
The rigid schema of a relational database make it hard seek for data relationships, whereas a graph db can make it easy.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why use a Graph database?
&lt;/h2&gt;

&lt;p&gt;So graph DBs allow you to better structure and visualize your data. But why might you need one?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Top use cases are: Real-time &lt;strong&gt;recommendation engines&lt;/strong&gt;, &lt;strong&gt;Knowledge graphs&lt;/strong&gt; and &lt;strong&gt;Fraud detection&lt;/strong&gt;.&lt;br&gt;
But you can also think of a &lt;strong&gt;social media network&lt;/strong&gt; where people are connected by different kinds of relations.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Due to their flexibility, Graph DBs can easily adapt to your data model, even if it changes over time.&lt;br&gt;
One approach that is becoming more and more relevant is to use a Graph database alongside a traditional data store.&lt;br&gt;
For instance, you can use a MySQL db to store your users, products and sales. Then you can have a Graph db to map your users' habits and build your custom recommendation engine.&lt;br&gt;
You always have to keep in mind that each db type has its own preferred use: a graph db wins easily when you have to &lt;strong&gt;traverse relationships&lt;/strong&gt; between entities, while a SQL db is more efficient at aggregating and grouping results.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fliegu3i6yal7dzap2ubp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fliegu3i6yal7dzap2ubp.png" alt="RedisGraph" width="607" height="194"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Why RedisGraph?
&lt;/h2&gt;

&lt;p&gt;Nowadays there are multiple Graph DBs: Neo4j is the first that comes to mind. Then there are Amazon Neptune, ArangoDB, TypeDB, etc...&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Neo4j&lt;/strong&gt; has some tricky licensing and they seem to push hard on the "Enterprise" plan: with the community edition you can't have more than one database per installation and you can't create a cluster. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Neptune&lt;/strong&gt; is a "pay as you go" solution, not suitable for learning. It uses Apache TinkerPop Gremlin and W3C’s SPARQL as query languages. It's way too easy to get locked in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ArangoDB&lt;/strong&gt; has a unique multi-model approach (nodes are essentially documents stored in collections). It uses AQL, a proprietary query language.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TypeDB&lt;/strong&gt; uses its own query language (TypeQL); you can get locked in pretty quickly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RedisGraph&lt;/strong&gt; is a pluggable module for Redis, developed by RedisLabs. While being younger than its direct competitor Neo4j (and still not as feature-rich), it's simpler and faster. New features are being included in each version.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Graph Elements
&lt;/h2&gt;

&lt;p&gt;A graph database consists of:&lt;br&gt;
&lt;strong&gt;&lt;em&gt;Nodes&lt;/em&gt;&lt;/strong&gt;: nodes represent entities in your domain.&lt;br&gt;
&lt;strong&gt;&lt;em&gt;Edges&lt;/em&gt;&lt;/strong&gt;: edges are the connections between nodes. An edge represents a relationship between a start node and an end node, thus giving the relationship a "direction".&lt;br&gt;
&lt;strong&gt;&lt;em&gt;Properties&lt;/em&gt;&lt;/strong&gt;: properties are key-value pairs attached to nodes or edges. With properties you can store extra information about an entity or a relation.&lt;br&gt;
&lt;strong&gt;&lt;em&gt;Labels&lt;/em&gt;&lt;/strong&gt;: labels help you categorize elements (nodes and edges).&lt;/p&gt;
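
&lt;p&gt;As a small preview of the Cypher notation, here is a sketch of how these four elements can fit together in a single pattern. The names and values are made up for illustration: two nodes labeled &lt;code&gt;Person&lt;/code&gt;, each with its own properties, connected by an edge of type &lt;code&gt;KNOWS&lt;/code&gt; that carries a property of its own and points from the first node to the second:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cypher"&gt;&lt;code&gt;(:Person {name: 'Alice', age: 30})-[:KNOWS {relation: 'friend'}]-&amp;gt;(:Person {name: 'Bob'})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;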

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;An easy way to think about graphs is as analogous to the relationship between nouns and verbs. Nodes, or the nouns, are things such as people, places, and items. Relationships, or the verbs, are how they’re connected. People know each other and items are sent to places. The signal in those relationships is powerful.&lt;/em&gt; (&lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/analyze-graph-data-on-google-cloud-with-neo4j-and-vertex-ai" rel="noopener noreferrer"&gt;source&lt;/a&gt;)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Running RedisGraph locally
&lt;/h2&gt;

&lt;p&gt;To get started you can go to &lt;a href="https://redis.com" rel="noopener noreferrer"&gt;https://redis.com&lt;/a&gt; and sign up for a free Redis cloud tier. The free subscription includes 1 dedicated database, 30MB of RAM and the RedisGraph module, so you can use it to start learning RedisGraph.&lt;/p&gt;

&lt;p&gt;The most interesting way to start exploring RedisGraph though is by running a local docker container.&lt;br&gt;
Here's a simple &lt;code&gt;docker-compose.yml&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3"&lt;/span&gt;
&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;redis-stack&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis/redis-stack:latest&lt;/span&gt;
        &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-stack&lt;/span&gt;
        &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./local-data/:/data&lt;/span&gt;
        &lt;span class="na"&gt;expose&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="m"&gt;6379&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="m"&gt;8001&lt;/span&gt;
        &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;6379:6379"&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8001:8001"&lt;/span&gt;
        &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;PATH=./bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can launch it with &lt;code&gt;docker compose up -d&lt;/code&gt;.&lt;br&gt;
If you open your browser and point it to &lt;code&gt;http://localhost:8001&lt;/code&gt; you will see the main interface of RedisInsight.&lt;br&gt;
RedisInsight is the web UI of Redis.&lt;br&gt;
When you're finished, you can stop the container with &lt;code&gt;docker compose stop&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  In the next post
&lt;/h2&gt;

&lt;p&gt;We've covered the basics of what a Graph database is and how it can be used.&lt;br&gt;
In the next post we'll see how to create a database and use Cypher to query it.&lt;/p&gt;

</description>
      <category>cypher</category>
      <category>redisgraph</category>
      <category>graph</category>
      <category>redis</category>
    </item>
    <item>
      <title>✨ Porting Lambda Functions to AWS SAM</title>
      <dc:creator>Gianluca Fabrizi</dc:creator>
      <pubDate>Sun, 23 Apr 2023 21:55:22 +0000</pubDate>
      <link>https://dev.to/gfabrizi/porting-lambda-functions-to-aws-sam-3hl7</link>
      <guid>https://dev.to/gfabrizi/porting-lambda-functions-to-aws-sam-3hl7</guid>
      <description>&lt;p&gt;Two weeks ago I attended to JsDay 2023 (from Verona, Italy 🇮🇹).&lt;br&gt;
One talk hit me in particular: "&lt;em&gt;Production-ready lambdas with Node.js&lt;/em&gt;" by Luciano Mammino.&lt;br&gt;
He explained some trick, tips and best practices to work with AWS Lambda in a production environment.&lt;br&gt;
One best practice is this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Stop creating resources manually on your AWS account, like right now! If you are doing this, please STOP"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6wwp1o1favxe7frrunt4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6wwp1o1favxe7frrunt4.png" alt="Everyone loves meme" width="800" height="528"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Ok, when he said this I felt reaaallly really guilty.&lt;/p&gt;
&lt;h2&gt;
  
  
  AWS SAM
&lt;/h2&gt;

&lt;p&gt;There are many IaC tools that can be used with AWS. Maybe the logical option (or the one that seems to fit best) is AWS SAM.&lt;br&gt;
The AWS Serverless Application Model (SAM) is a framework for building serverless applications. It's open-source and all the configuration can be written in YAML files.&lt;br&gt;
It comes with a CLI tool that you need to install (it's separate from aws-cli); you can use it to deploy your infrastructure, deploy (and sync) your application, test your code locally and much, much more.&lt;br&gt;
I confess that I had never used it (really, I had never even heard of it...) so let's learn it by porting a previous project to SAM.&lt;/p&gt;
&lt;h2&gt;
  
  
  THE PROJECT
&lt;/h2&gt;

&lt;p&gt;The project is my previous "Lambda Inception Architectural Pattern" (quite a mouthful, right? 🤭):&lt;br&gt;
&lt;a href="https://dev.to/gfabrizi/lambda-inception-architectural-pattern-f67"&gt;https://dev.to/gfabrizi/lambda-inception-architectural-pattern-f67&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In that post I wrote blocks and blocks of code to describe roles and policies of the infrastructure... how naive!&lt;br&gt;
So let's start by looking at the SAM documentation:&lt;br&gt;
&lt;a href="https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-getting-started-hello-world.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-getting-started-hello-world.html&lt;/a&gt;&lt;br&gt;
After digging through the examples and the documentation I started writing my &lt;code&gt;template.yaml&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  THE CODE
&lt;/h2&gt;

&lt;p&gt;First things first: the roles and policies.&lt;br&gt;
The simplest role to port to SAM is the &lt;code&gt;LambdaInceptionWorker&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;LambdaInceptionWorker&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::IAM::Role&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;RoleName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LambdaInceptionWorker&lt;/span&gt;
    &lt;span class="na"&gt;AssumeRolePolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2012-10-17'&lt;/span&gt;
      &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
          &lt;span class="na"&gt;Principal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;Service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lambda.amazonaws.com&lt;/span&gt;
          &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sts:AssumeRole&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;just some boilerplate code where we define a role (&lt;code&gt;Type: AWS::IAM::Role&lt;/code&gt;) and attach an &lt;code&gt;AssumeRolePolicyDocument&lt;/code&gt; that allows the Lambda service (and therefore any Lambda function) to assume this role. As we saw in the previous post, the &lt;code&gt;LambdaInceptionWorker&lt;/code&gt; role is empty, so there's nothing more to add here.&lt;/p&gt;

&lt;p&gt;Next is the &lt;code&gt;LambdaInceptionManager&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;LambdaInceptionManager&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::IAM::Role&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;RoleName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LambdaInceptionManager&lt;/span&gt;
    &lt;span class="na"&gt;AssumeRolePolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2012-10-17'&lt;/span&gt;
      &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
          &lt;span class="na"&gt;Principal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;Service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lambda.amazonaws.com&lt;/span&gt;
          &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sts:AssumeRole&lt;/span&gt;
    &lt;span class="na"&gt;ManagedPolicyArns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole&lt;/span&gt;
    &lt;span class="na"&gt;Policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LambdaInceptionPassRole&lt;/span&gt;
        &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2012-10-17'&lt;/span&gt;
          &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;iam:PassRole&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s"&gt;arn:aws:iam::${AWS::AccountId}:role/LambdaInceptionWorker&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LambdaInceptionCreateFunction&lt;/span&gt;
        &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2012-10-17'&lt;/span&gt;
          &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;lambda:CreateFunction&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s"&gt;arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:*&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LambdaInceptionDeleteFunction&lt;/span&gt;
        &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2012-10-17'&lt;/span&gt;
          &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;lambda:DeleteFunction&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s"&gt;arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:*&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LambdaInvokeFunction&lt;/span&gt;
        &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2012-10-17'&lt;/span&gt;
          &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;lambda:InvokeFunction&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s"&gt;arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:*&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's analyze the code section by section:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;ManagedPolicyArns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;with this we are attaching an AWS managed policy to the role (the basic execution role, needed by Lambda Function URL).&lt;br&gt;
Then we start adding inline policies to the role; let's look at just the first one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LambdaInceptionPassRole&lt;/span&gt;
  &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2012-10-17'&lt;/span&gt;
    &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
        &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;iam:PassRole&lt;/span&gt;
        &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s"&gt;arn:aws:iam::${AWS::AccountId}:role/LambdaInceptionWorker&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we create a new inline policy called &lt;code&gt;LambdaInceptionPassRole&lt;/code&gt;; this policy allows the &lt;code&gt;iam:PassRole&lt;/code&gt; action only on the specified resource.&lt;br&gt;
In the &lt;code&gt;Resource&lt;/code&gt; line we specify the &lt;code&gt;LambdaInceptionWorker&lt;/code&gt; role by passing its ARN. We use a built-in variable for the account id (&lt;code&gt;${AWS::AccountId}&lt;/code&gt;). The &lt;code&gt;!Sub&lt;/code&gt; keyword at the beginning of the value indicates that the string contains a variable to be replaced with its value. Another useful built-in variable is &lt;code&gt;${AWS::Region}&lt;/code&gt;.&lt;br&gt;
The other 3 policies have the same structure, so we skip them.&lt;/p&gt;
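&lt;p&gt;To make the substitution concrete, here is a minimal JavaScript sketch of what &lt;code&gt;!Sub&lt;/code&gt; conceptually does with the string. It is only an illustration, not how CloudFormation is implemented: the &lt;code&gt;sub&lt;/code&gt; helper is hypothetical, and the account id and region are made-up sample values.&lt;/p&gt;

```javascript
// Illustrative only: mimics how !Sub fills in ${...} placeholders.
// The account id and region below are made-up sample values.
const pseudoParameters = {
  'AWS::AccountId': '123456789012',
  'AWS::Region': 'eu-west-1',
};

// Hypothetical helper: replace each ${Name} with its value,
// leaving unknown names untouched.
function sub(template, values) {
  return template.replace(/\$\{([^}]+)\}/g, (match, name) =>
    name in values ? values[name] : match
  );
}

const roleArn = sub(
  'arn:aws:iam::${AWS::AccountId}:role/LambdaInceptionWorker',
  pseudoParameters
);
console.log(roleArn);
// prints arn:aws:iam::123456789012:role/LambdaInceptionWorker
```

&lt;p&gt;In the real template the values come from the stack being deployed, so the same file works unchanged across accounts and regions.&lt;/p&gt;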

&lt;p&gt;Then we create the IAM user that will invoke the Inception manager function from the command line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;LambdaInceptionInvoker&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::IAM::User&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;UserName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lambda-inception-invoker&lt;/span&gt;
    &lt;span class="na"&gt;Policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LambdaInceptionInvoke&lt;/span&gt;
        &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2012-10-17'&lt;/span&gt;
          &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;lambda:InvokeFunctionUrl&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s"&gt;arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:lambda-inception&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The syntax of this block is the same as the previous one, nothing new.&lt;/p&gt;

&lt;p&gt;Finally we can write the definition of the Inception Manager function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;LambdaInceptionManagerFunction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless::Function&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;FunctionName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lambda-inception"&lt;/span&gt;
    &lt;span class="na"&gt;CodeUri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;manager/&lt;/span&gt;
    &lt;span class="na"&gt;Handler&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;manager.handler&lt;/span&gt;
    &lt;span class="na"&gt;Runtime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nodejs18.x&lt;/span&gt;
    &lt;span class="na"&gt;Architectures&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;x86_64&lt;/span&gt;
    &lt;span class="na"&gt;MemorySize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;512&lt;/span&gt;
    &lt;span class="na"&gt;Timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
    &lt;span class="na"&gt;Role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;LambdaInceptionManager.Arn&lt;/span&gt;
    &lt;span class="na"&gt;FunctionUrlConfig&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;AuthType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS_IAM&lt;/span&gt;
    &lt;span class="na"&gt;Environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;LAMBDA_INCEPTION_WORKER_ROLE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;LambdaInceptionWorker.Arn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We use the type &lt;code&gt;AWS::Serverless::Function&lt;/code&gt; to declare a Lambda Function.&lt;br&gt;
We give it a name, then specify the path where the code lives, the handler and some other common Lambda configuration.&lt;br&gt;&lt;br&gt;
Then we assign a role to the function. &lt;code&gt;!GetAtt&lt;/code&gt; is another keyword that returns an attribute of a resource; in this case it returns the ARN of the &lt;code&gt;LambdaInceptionManager&lt;/code&gt; role seen previously.&lt;br&gt;&lt;br&gt;
With the last 3 lines we pass an environment variable to the function. We can access it from the JS code with &lt;code&gt;process.env.LAMBDA_INCEPTION_WORKER_ROLE&lt;/code&gt;.&lt;/p&gt;
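&lt;p&gt;As a rough sketch of how the manager could read that variable defensively, the hypothetical helper below (&lt;code&gt;getWorkerRoleArn&lt;/code&gt; is not part of the repository) fails early with a clear message when the variable is missing, for example when the handler runs locally outside of SAM.&lt;/p&gt;

```javascript
// Hypothetical helper: read the worker role ARN injected by the template.
// The env parameter defaults to process.env and exists only to ease testing.
function getWorkerRoleArn(env = process.env) {
  const arn = env.LAMBDA_INCEPTION_WORKER_ROLE;
  if (!arn) {
    // Fail fast instead of passing undefined to the Lambda API later on.
    throw new Error('LAMBDA_INCEPTION_WORKER_ROLE is not set');
  }
  return arn;
}
```

&lt;p&gt;Calling &lt;code&gt;getWorkerRoleArn()&lt;/code&gt; once at startup would surface a misconfigured deployment before any worker is created.&lt;/p&gt;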

&lt;h2&gt;
  
  
  DEPLOY
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;sam build&lt;/code&gt; command processes the AWS SAM template file, the application code, and any applicable language-specific files and dependencies (e.g. by running &lt;code&gt;npm install&lt;/code&gt;).&lt;br&gt;&lt;br&gt;
Then we can launch &lt;code&gt;sam deploy&lt;/code&gt; to deploy the infrastructure and code on AWS.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdckxrpq95absrs25yxbe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdckxrpq95absrs25yxbe.png" alt="SAM successfully deployed" width="800" height="490"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  FINAL NOTES
&lt;/h2&gt;

&lt;p&gt;It was quite a journey 😅&lt;br&gt;
We saw how we can start using an IaC tool to define and manage a cloud infrastructure.&lt;br&gt;
The infrastructure as defined here is far from perfect; this is just a learn-by-doing exercise.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The updated code can be downloaded from:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://github.com/gfabrizi/lambda-inception-sam" rel="noopener noreferrer"&gt;https://github.com/gfabrizi/lambda-inception-sam&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Leave a comment if you have questions or issues with the code.&lt;br&gt;
Thanks for reading! 👋&lt;/p&gt;

</description>
      <category>aws</category>
      <category>lambda</category>
      <category>sam</category>
      <category>serverless</category>
    </item>
    <item>
      <title>Lambda Inception Architectural Pattern</title>
      <dc:creator>Gianluca Fabrizi</dc:creator>
      <pubDate>Thu, 23 Mar 2023 15:48:48 +0000</pubDate>
      <link>https://dev.to/gfabrizi/lambda-inception-architectural-pattern-f67</link>
      <guid>https://dev.to/gfabrizi/lambda-inception-architectural-pattern-f67</guid>
      <description>&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;In some cases (some "shadier" than others) you may need multiple "clean" IP addresses. &lt;br&gt;
One obvious use case is a web scraper. &lt;br&gt;
When you make many GET requests to the same host, you may be served a captcha to prove you're human, or your IP may be rate limited or banned.&lt;/p&gt;

&lt;p&gt;Ideally you want something that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;is simple to use&lt;/li&gt;
&lt;li&gt;doesn't need additional software&lt;/li&gt;
&lt;li&gt;can be used with traditional CLI commands like cURL&lt;/li&gt;
&lt;li&gt;is cheap&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can deal with this problem in several ways: use a VPN, use a Tor proxy (please, please, please &lt;strong&gt;don't&lt;/strong&gt; do this) or use some kind of throwaway IP. &lt;br&gt;
You could use an AWS EC2 instance as your proxy; when its IP is banned, you take a snapshot of the instance and create a new one. &lt;br&gt;
...or you can be really creative and use AWS Lambda Functions...&lt;/p&gt;
&lt;h2&gt;
  
  
  The pattern
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;NOTE:&lt;/em&gt;&lt;/strong&gt; I have never seen this pattern applied to Lambda functions, and I checked several times. If you know someone who has already done this, please let me know in the comments!&lt;/p&gt;

&lt;p&gt;The pattern is nothing new: you have a manager always running and one or more workers doing the heavy lifting. &lt;br&gt;
Sticking with the scraper example, the "heavy lifting" here is making the GET requests.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why can't we use a single Lambda?&lt;/strong&gt;&lt;br&gt;
Lambda functions are by definition "serverless": they aren't tied to the underlying hardware, and you can't make assumptions about it. &lt;br&gt;
On AWS, Lambda functions have an "execution context": on the first run (cold start) AWS creates an execution context for the function. &lt;br&gt;
This context is kept alive for an unspecified period of time (usually between 5 and 7 minutes). During this time your Lambda function responds faster (warm start) but keeps the same IP address, so repeated calls to a single Lambda would not give us a fresh IP. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So we have a manager (let's call it "lambda-inception").&lt;br&gt;
This manager is a Lambda Function URL (a function with a dedicated URL you can call). &lt;br&gt;
We call the manager URL, passing a payload that contains the URL to scrape.&lt;/p&gt;

&lt;p&gt;The manager creates a new worker Lambda Function (without a URL, since we don't need to access it directly), passes it the URL to scrape, awaits the response, destroys the worker and returns the response to the client. &lt;br&gt;
We destroy the worker after each call because we want a clean IP every time.&lt;/p&gt;
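&lt;p&gt;The lifecycle described above can be sketched as follows. This is a simplified outline rather than the repository's code: &lt;code&gt;scrapeWithFreshWorker&lt;/code&gt; and its three callbacks are hypothetical stand-ins for the real AWS SDK calls. The &lt;code&gt;try/finally&lt;/code&gt; is a design choice made for this sketch, so that the worker is deleted even when the invocation fails.&lt;/p&gt;

```javascript
// Sketch of the manager flow. The three callbacks are hypothetical
// stand-ins for the real AWS SDK calls (create, invoke, delete).
async function scrapeWithFreshWorker(url, { createWorker, invokeWorker, deleteWorker }) {
  // 1. Create a brand new worker function for this request only.
  const worker = await createWorker();
  try {
    // 2. Ask the worker to fetch the page and hand back its response.
    return await invokeWorker(worker, { url });
  } finally {
    // 3. Runs on success and on failure, so no worker is left behind.
    await deleteWorker(worker);
  }
}
```

&lt;p&gt;Because the dependencies are passed in, the flow can be exercised locally with stub functions before wiring it to a real AWS account.&lt;/p&gt;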

&lt;p&gt;So without further ado this is the configuration on AWS:&lt;/p&gt;
&lt;h3&gt;
  
  
  Roles
&lt;/h3&gt;

&lt;p&gt;We need 2 roles (&lt;code&gt;LambdaInceptionWorker&lt;/code&gt; and &lt;code&gt;LambdaInceptionManager&lt;/code&gt;) with some custom policies (replace AWS_REGION and AWS_ACCOUNT_ID with your account information):&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LambdaInceptionWorker&lt;/strong&gt;&lt;br&gt;
(no permission policies) We leave the worker role without policies (not even &lt;code&gt;AWSLambdaBasicExecutionRole&lt;/code&gt;) so it doesn't log anything to CloudWatch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LambdaInceptionManager&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;LambdaInceptionPassRole&lt;/code&gt; (Custom created):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"VisualEditor0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"iam:PassRole"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::AWS_ACCOUNT_ID:role/LambdaInceptionWorker"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;LambdaInceptionCreateFunction&lt;/code&gt; (Custom created):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"VisualEditor0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"lambda:CreateFunction"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:lambda:AWS_REGION:AWS_ACCOUNT_ID:function:*"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;LambdaInceptionDeleteFunction&lt;/code&gt; (Custom created):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"VisualEditor0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"lambda:DeleteFunction"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:lambda:AWS_REGION:AWS_ACCOUNT_ID:function:*"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;LambdaInvokeFunction&lt;/code&gt; (Custom created):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"VisualEditor0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"lambda:InvokeFunction"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:lambda:AWS_REGION:AWS_ACCOUNT_ID:function:*"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;AWSLambdaBasicExecutionRole&lt;/code&gt; (AWS standard policy)&lt;/p&gt;

&lt;h3&gt;
  
  
  Users
&lt;/h3&gt;

&lt;p&gt;If you want to keep your Functions protected (why wouldn't you?) you need to create an IAM user (call it &lt;code&gt;lambda-inception-invoker&lt;/code&gt;) with &lt;code&gt;LambdaInceptionInvoke&lt;/code&gt; custom policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"VisualEditor0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"lambda:InvokeFunctionUrl"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:lambda:AWS_REGION:AWS_ACCOUNT_ID:function:lambda-inception"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We need programmatic access for this user, so after creating the user go to the "Security Credentials" tab and create an access key.&lt;/p&gt;

&lt;p&gt;You may need a &lt;code&gt;lambda-deployer&lt;/code&gt; user with the &lt;code&gt;AWSLambda_FullAccess&lt;/code&gt; permission policy if you want to use the script provided in the GitHub repository to deploy the infrastructure on your account.&lt;/p&gt;

&lt;h2&gt;
  
  
  The code
&lt;/h2&gt;

&lt;p&gt;The code is available here: &lt;a href="https://github.com/gfabrizi/lambda-inception" rel="noopener noreferrer"&gt;https://github.com/gfabrizi/lambda-inception&lt;/a&gt; &lt;br&gt;
The code uses a mono-lambda approach to manage the routing.&lt;br&gt;&lt;br&gt;
&lt;code&gt;src/app.mjs&lt;/code&gt; and &lt;code&gt;src/res.mjs&lt;/code&gt; are responsible for the basic routing and for creating an envelope for the response.&lt;/p&gt;

&lt;p&gt;The code for the manager lies in &lt;code&gt;index.js&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;worker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createWorker&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lambdaClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LambdaClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;awsRegion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;2015-03-31&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;invokeCommand&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;InvokeCommand&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;FunctionName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;FunctionName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;Payload&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;url&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;accept&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;accept&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text/html,application/xhtml+xml,application/xml;q=0.9&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;contentType&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;contentType&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/octet-stream&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;method&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GET&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="na"&gt;LogType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;LogType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Tail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;LogResult&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;lambdaClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;invokeCommand&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Payload&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;deleteCommand&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;DeleteFunctionCommand&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;FunctionName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;FunctionName&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;lambdaClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;deleteCommand&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we create a new worker and invoke it (passing the URL, the method and a few headers).&lt;br&gt;
Then we await the worker's response, destroy the worker and return the scraped HTML to the client.&lt;/p&gt;
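&lt;p&gt;The invoke response carries the worker's output as raw bytes, which is why the manager converts &lt;code&gt;Payload&lt;/code&gt; with &lt;code&gt;Buffer.from(...).toString()&lt;/code&gt;. The sketch below shows that decoding step in isolation. It assumes the worker returns a JSON document; the &lt;code&gt;decodePayload&lt;/code&gt; helper and the sample bytes are hypothetical and only simulate a real response.&lt;/p&gt;

```javascript
// Assumption: the worker returns a JSON document. The sample bytes below
// stand in for the Payload field of a real invoke response.
function decodePayload(payload) {
  // Convert the byte array to a UTF-8 string, then parse it as JSON.
  const text = Buffer.from(payload).toString('utf8');
  return JSON.parse(text);
}

const samplePayload = new TextEncoder().encode(
  JSON.stringify({ statusCode: 200, body: 'hello' })
);
console.log(decodePayload(samplePayload).body); // prints hello
```

&lt;p&gt;If the worker returned plain text instead of JSON, the &lt;code&gt;JSON.parse&lt;/code&gt; step would simply be dropped.&lt;/p&gt;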

&lt;p&gt;&lt;code&gt;src/create-worker.mjs&lt;/code&gt; wraps the functionality to create a new worker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lambda&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LambdaClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;awsRegion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;2015-03-31&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;functionName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;worker-&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;randomUUID&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;functionCommand&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;CreateFunctionCommand&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;ZipFile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;readFileSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;worker.zip&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;Architectures&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;Architecture&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;x86_64&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;FunctionName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;functionName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;worker.handler&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;PackageType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PackageType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Zip&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;Role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;lambdaInceptionWorkerRole&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;Runtime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;nodejs18.x&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Crawler Worker &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;functionName&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;functionCommand&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nothing too fancy here: we specify the archive (&lt;code&gt;worker.zip&lt;/code&gt;) that contains the worker's code, along with its handler, runtime and IAM role.&lt;/p&gt;

&lt;p&gt;The worker code (&lt;code&gt;worker/worker.js&lt;/code&gt;) is just a call to got.js:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;got&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user-agent&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;getUserAgent&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;contentType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Accept&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;accept&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;statusCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;getUserAgent()&lt;/code&gt; call gives us a simple user-agent rotation, just in case...&lt;/p&gt;
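
&lt;p&gt;The body of &lt;code&gt;getUserAgent()&lt;/code&gt; is not shown in the snippet above. A minimal sketch of such a rotation, assuming a small hard-coded pool (the &lt;code&gt;USER_AGENTS&lt;/code&gt; list is hypothetical, and the implementation in the repository may differ), could look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Hypothetical pool of user-agent strings (an assumption for illustration only).
const USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15',
    'Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0'
];

// Return one user-agent string picked at random from the pool.
function getUserAgent() {
    const index = Math.floor(Math.random() * USER_AGENTS.length);
    return USER_AGENTS[index];
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each call returns one of the strings in the pool, so consecutive requests made by a worker do not always present the same user agent.&lt;/p&gt;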

&lt;h2&gt;
  
  
  Make a request
&lt;/h2&gt;

&lt;p&gt;If everything is configured correctly (see also the &lt;code&gt;README.md&lt;/code&gt; file in the repository), you can run &lt;code&gt;./deploy.sh&lt;/code&gt; to deploy the manager as a Lambda function.&lt;/p&gt;

&lt;p&gt;Then you can call your function with a simple cURL request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://XXXXXXXXXXXXXXXXXXXXXX.lambda-url.AWS_REGION.on.aws/crawl &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"url":"https://ifconfig.me/", "accept":"application/json", "method":"GET"}'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--user&lt;/span&gt; XXXXXXXXXXXXXXXXXXXX:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--aws-sigv4&lt;/span&gt; &lt;span class="s2"&gt;"aws:amz:AWS_REGION:lambda"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(Replace all the &lt;code&gt;X&lt;/code&gt;s with your function URL and account credentials, and replace &lt;code&gt;AWS_REGION&lt;/code&gt; with the region you used.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;p&gt;The code is available here: &lt;a href="https://github.com/gfabrizi/lambda-inception" rel="noopener noreferrer"&gt;https://github.com/gfabrizi/lambda-inception&lt;/a&gt;  &lt;/p&gt;

</description>
      <category>lambda</category>
      <category>aws</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
