<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Chillar Anand</title>
    <description>The latest articles on DEV Community by Chillar Anand (@chillaranand).</description>
    <link>https://dev.to/chillaranand</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F18842%2F46af4a1c-bedc-43c2-8e22-40529b6092ad.jpg</url>
      <title>DEV Community: Chillar Anand</title>
      <link>https://dev.to/chillaranand</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/chillaranand"/>
    <language>en</language>
    <item>
      <title>Free DockerHub Alternative - ECR Public Gallery</title>
      <dc:creator>Chillar Anand</dc:creator>
      <pubDate>Sun, 09 Feb 2025 16:08:34 +0000</pubDate>
      <link>https://dev.to/chillaranand/free-dockerhub-alternative-ecr-public-gallery-b1n</link>
      <guid>https://dev.to/chillaranand/free-dockerhub-alternative-ecr-public-gallery-b1n</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffarcwcef81jp62cd9oen.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffarcwcef81jp62cd9oen.png" alt="docker-rate-limits" width="800" height="413"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;DockerHub started rate limiting&lt;sup id="fnref:rate"&gt;&lt;a href="https://avilpage.com/2025/02/free-dockerhub-alternative-ecr-gallery.html#fn:rate" rel="noopener noreferrer"&gt;1&lt;/a&gt;&lt;/sup&gt; anonymous docker pulls. When testing out a new CI/CD setup, I hit the rate limit and had to wait for an hour to pull the image. This was a good time to look for alternatives.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://gallery.ecr.aws/" rel="noopener noreferrer"&gt;AWS ECR Public Gallery&lt;/a&gt;&lt;sup id="fnref:ecr"&gt;&lt;a href="https://avilpage.com/2025/02/free-dockerhub-alternative-ecr-gallery.html#fn:ecr" rel="noopener noreferrer"&gt;2&lt;/a&gt;&lt;/sup&gt; is a good alternative to DockerHub as of today(2025 Feb). It is free and does not have rate limits even for anonymous users.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpq5hibudhz0wkmbdc501.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpq5hibudhz0wkmbdc501.png" alt="public-ecr-gallery" width="800" height="376"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once we find the required image from the gallery, we can simply change the image name in the &lt;code&gt;docker pull&lt;/code&gt; command to pull the image from ECR Gallery.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker pull public.ecr.aws/ubuntu/ubuntu

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In &lt;code&gt;Dockerfile&lt;/code&gt;, we can use the image from ECR Gallery as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FROM public.ecr.aws/ubuntu/ubuntu

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is a quick way to avoid DockerHub rate limits.&lt;/p&gt;




&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://docs.docker.com/docker-hub/usage/" rel="noopener noreferrer"&gt;DockerHub Limits&lt;/a&gt; &lt;a href="https://avilpage.com/2025/02/free-dockerhub-alternative-ecr-gallery.html#fnref:rate" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://gallery.ecr.aws" rel="noopener noreferrer"&gt;AWS ECR Public Gallery&lt;/a&gt; &lt;a href="https://avilpage.com/2025/02/free-dockerhub-alternative-ecr-gallery.html#fnref:ecr" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>devops</category>
      <category>docker</category>
    </item>
    <item>
      <title>Postman - Auto Login &amp; Renew OAuth2 Token</title>
      <dc:creator>Chillar Anand</dc:creator>
      <pubDate>Fri, 31 Jan 2025 17:20:17 +0000</pubDate>
      <link>https://dev.to/chillaranand/postman-auto-login-renew-oauth2-token-1j83</link>
      <guid>https://dev.to/chillaranand/postman-auto-login-renew-oauth2-token-1j83</guid>
      <description>&lt;h4&gt;
  
  
  Introduction
&lt;/h4&gt;

&lt;p&gt;When using Postman to interact with APIs behind OAuth2 authentication, we need to log in and renew the token manually. This can be automated with the following steps.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set credentials in environment variables&lt;/li&gt;
&lt;li&gt;Create a pre-request script to login and renew the token&lt;/li&gt;
&lt;li&gt;Use the token in the request headers&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Automating Login &amp;amp; Renewal
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;var e = pm.environment;
var isSessionExpired = true;

var loginTimestamp = e.get("loginTimestamp");
var expiresInSeconds = pm.environment.get("expiresInSeconds") || 86400;

if (loginTimestamp) {
  var loginDuration = Date.now() - loginTimestamp;
  isSessionExpired = loginDuration &amp;gt;= expiresInSeconds;
}

if (isSessionExpired) {
  pm.sendRequest({
    url: e.get('host') + "/auth/connect/token",
    method: 'POST',
    header: {
      'Content-Type': 'application/x-www-form-urlencoded',
      'Accept': 'application/json'
    },
    body: {
        mode: 'urlencoded',
        urlencoded: [
          { key: "username", value: e.get('username') },
          { key: "password", value: e.get('password') },
          { key: "grant\_type", value: "password" },
          { key: "client\_id", value: e.get("client\_id") }
        ]
    }
  }, function (err, res) {
    jsonData = res.json();

    e.set("access\_token", jsonData.access\_token);

    if(res.json().expires\_in){
        expiresInSeconds = res.json().expires\_in \* 1000;
    }
    e.set("expiresInSeconds", expiresInSeconds);
    e.set("loginTimestamp", Date.now())
  });
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can copy this script to the pre-request script of the collection.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8lvavxa5ase8b2acitur.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8lvavxa5ase8b2acitur.png" alt="Cockpit" width="800" height="478"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most of the script is self-explanatory: it checks whether the session has expired and, if so, sends a request to the token endpoint to get a new token. The token is stored in an environment variable and used in the request headers.&lt;/p&gt;
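&lt;p&gt;To use the token, we can add an &lt;code&gt;Authorization&lt;/code&gt; header at the collection level (or on individual requests). Assuming the &lt;code&gt;access_token&lt;/code&gt; environment variable set by the script above, the header would be:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Authorization: Bearer {{access_token}}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;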

&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;This is a one-time setup for a Postman collection, and it saves a lot of time in the long run. The script can be modified to handle different grant types and token renewal strategies.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Install Cockpit on Remote Linux VM</title>
      <dc:creator>Chillar Anand</dc:creator>
      <pubDate>Mon, 30 Dec 2024 22:54:07 +0000</pubDate>
      <link>https://dev.to/chillaranand/install-cockpit-on-remote-linux-vm-3pah</link>
      <guid>https://dev.to/chillaranand/install-cockpit-on-remote-linux-vm-3pah</guid>
      <description>&lt;h4&gt;
  
  
  Introduction
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcfiif1l9jqmurwd1s41.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcfiif1l9jqmurwd1s41.png" alt="Cockpit" width="800" height="459"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Cockpit is an easy-to-use web-based interface (like cPanel) for managing Linux servers. When we want to provide access to non-developers or people who are new to Linux, it is a good idea to get them started with Cockpit. It provides a user-friendly interface to manage services, containers, storage, logs, and more.&lt;/p&gt;

&lt;h4&gt;
  
  
  Setup
&lt;/h4&gt;

&lt;p&gt;Let's create a new Ubuntu VM and install Cockpit on it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt update
. /etc/os-release
sudo apt install -t ${VERSION_CODENAME}-backports cockpit

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the installation is complete, we can get the public IP of the VM and access the Cockpit web interface running on port 9090.&lt;/p&gt;

&lt;p&gt;It is difficult to remember the public IP of the VM, so let's create a DNS record for it: add an &lt;code&gt;A&lt;/code&gt; record in the DNS settings to point &lt;code&gt;cockpit.avilpage.com&lt;/code&gt; to the public IP of the VM.&lt;/p&gt;
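&lt;p&gt;In zone-file form, the record would look like this (&lt;code&gt;203.0.113.10&lt;/code&gt; is a placeholder for the VM's public IP):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cockpit.avilpage.com.    300    IN    A    203.0.113.10

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;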

&lt;h4&gt;
  
  
  Reverse Proxy
&lt;/h4&gt;

&lt;p&gt;Let's set up a reverse proxy to access the Cockpit web interface using a subdomain.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt install caddy

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add the below configuration to &lt;code&gt;/etc/caddy/Caddyfile&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cockpit.avilpage.com {
    reverse_proxy localhost:9090
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We need to add &lt;code&gt;Origins&lt;/code&gt; to the Cockpit configuration at &lt;code&gt;/etc/cockpit/cockpit.conf&lt;/code&gt; to allow requests from the subdomain.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[WebService]
Origins = https://cockpit.avilpage.com

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Restart both services and open &lt;a href="https://cockpit.avilpage.com" rel="noopener noreferrer"&gt;https://cockpit.avilpage.com&lt;/a&gt; in browser.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo systemctl restart cockpit
sudo systemctl restart caddy

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;The Cockpit web UI is a great tool for managing Linux servers, even for non-developers. Users can browse and manage logs, services, and more. It also provides a terminal to run commands on the server.&lt;/p&gt;

</description>
      <category>linux</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Cube &amp; Cubicle</title>
      <dc:creator>Chillar Anand</dc:creator>
      <pubDate>Thu, 31 Oct 2024 03:35:37 +0000</pubDate>
      <link>https://dev.to/chillaranand/cube-cubicle-5cde</link>
      <guid>https://dev.to/chillaranand/cube-cubicle-5cde</guid>
      <description>&lt;h4&gt;
  
  
  Rubik's Cube
&lt;/h4&gt;

&lt;p&gt;When I was in college, I was traveling to a friend's place and missed the bus at midnight. The next bus was at 4 AM. Bored while waiting, I found a Rubik's Cube in a shop.&lt;/p&gt;

&lt;p&gt;I scrambled the cube and spent the next 4 hours trying to solve it. I managed to solve one color, but when I tried to solve the next one, the pieces in the previous layer got scrambled again.&lt;/p&gt;

&lt;p&gt;Even after spending a lot of time on it over the next 3 weeks, I couldn't solve it and gave up.&lt;/p&gt;

&lt;p&gt;A couple of years later, when I "learnt" about the internet, I searched and found simple algorithms to solve the cube. Within a few days, I was able to solve it in a minute.&lt;/p&gt;

&lt;h4&gt;
  
  
  Office Cubicles
&lt;/h4&gt;

&lt;p&gt;In the final year of college, placements began. While preparing my résumé, I included "I can solve a Rubik's Cube in a minute" in it.&lt;/p&gt;

&lt;p&gt;During the interview, the interviewer asked me if I could really solve the cube in a minute and told me to bring my cube and show him during the lunch break. I did. Luckily, I got hired.&lt;/p&gt;

&lt;p&gt;Even though I was hired by Wipro, I didn't join. I went to Bangalore and started applying for start-up jobs.&lt;/p&gt;

&lt;p&gt;I went for an interview at a web development company in Malleswaram, Bangalore. The CEO looked at my résumé, took out a cube from his desk. He handed the cube to me, showed an empty cubicle behind me and said, "If you solve the cube in a minute, that cubicle is yours."&lt;/p&gt;

&lt;p&gt;Just by learning the cube, I was able to land a job at an MNC (multinational company) as well as at a startup.&lt;/p&gt;

</description>
      <category>musings</category>
      <category>rubikscube</category>
    </item>
    <item>
      <title>tailscale: Resolving CGNAT (100.x.y.z) Conflicts</title>
      <dc:creator>Chillar Anand</dc:creator>
      <pubDate>Sat, 07 Sep 2024 07:20:05 +0000</pubDate>
      <link>https://dev.to/chillaranand/tailscale-resolving-cgnat-100xyz-conflicts-56lf</link>
      <guid>https://dev.to/chillaranand/tailscale-resolving-cgnat-100xyz-conflicts-56lf</guid>
      <description>&lt;h4&gt;
  
  
  Introduction
&lt;/h4&gt;

&lt;p&gt;In an earlier blog post, I wrote about using tailscale to remotely access any device&lt;sup id="fnref:ap"&gt;&lt;a href="https://avilpage.com/2024/09/tailscale-cgnat-conflicts-resolution.html#fn:ap" rel="noopener noreferrer"&gt;1&lt;/a&gt;&lt;/sup&gt;. Tailscale uses 100.64.0.0/10 subnet&lt;sup id="fnref:ts100"&gt;&lt;a href="https://avilpage.com/2024/09/tailscale-cgnat-conflicts-resolution.html#fn:ts100" rel="noopener noreferrer"&gt;2&lt;/a&gt;&lt;/sup&gt; to assign unique IP addresses to each device.&lt;/p&gt;

&lt;p&gt;When a tailscale node joins another campus network&lt;sup id="fnref:pn"&gt;&lt;a href="https://avilpage.com/2024/09/tailscale-cgnat-conflicts-resolution.html#fn:pn" rel="noopener noreferrer"&gt;3&lt;/a&gt;&lt;/sup&gt; (schools, universities, offices) that uses the same subnet, it will face conflicts. Let's see how to resolve this.&lt;/p&gt;

&lt;h4&gt;
  
  
  Private Network
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe5w4clfovv83mgb07oa9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe5w4clfovv83mgb07oa9.png" alt="tailscale dashboard" width="800" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the above scenario, node C1 can connect to C2 &amp;amp; C3, as they are in the same network.&lt;/p&gt;

&lt;p&gt;Once we start tailscale on node C1, it gets a 100.x.y.z IP address from the tailscale subnet. Now, node C1 will no longer be able to connect to nodes C2 &amp;amp; C3.&lt;/p&gt;

&lt;p&gt;To avoid conflicts with the existing network, we can configure tailscale to use a smaller subnet using "ipPool"&lt;sup id="fnref:ip"&gt;&lt;a href="https://avilpage.com/2024/09/tailscale-cgnat-conflicts-resolution.html#fn:ip" rel="noopener noreferrer"&gt;4&lt;/a&gt;&lt;/sup&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "acls": [
        "..."
    ],
    "nodeAttrs": [
        {
            "target": [
                "autogroup:admin"
            ],
            "ipPool": [
                "100.100.96.0/20"
            ]
        }
    ]
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once it is configured, tailscale will start assigning IP addresses from the new subnet. Even though IP allocation is now limited to this range, we still can't access nodes in the other subnets due to a bug&lt;sup id="fnref:tsb"&gt;&lt;a href="https://avilpage.com/2024/09/tailscale-cgnat-conflicts-resolution.html#fn:tsb" rel="noopener noreferrer"&gt;5&lt;/a&gt;&lt;/sup&gt; in tailscale.&lt;/p&gt;

&lt;p&gt;As a workaround, we can manually update the iptables rules so that traffic from the campus subnet is no longer dropped.&lt;/p&gt;

&lt;p&gt;Let's look at the iptables rules added by tailscale by stopping it and then starting it again.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6vxvw1gewibgfb8vwer1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6vxvw1gewibgfb8vwer1.png" alt="tailscale iptables rules" width="800" height="255"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kyev84haf20fzhp24kn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kyev84haf20fzhp24kn.png" alt="tailscale iptables rules" width="800" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The highlighted rule drops any incoming packet that doesn't originate from the tailscale0 interface but has a source IP in 100.64.0.0/10 (100.64.0.0 to 100.127.255.255).&lt;/p&gt;

&lt;p&gt;Let's delete this rule and add a new rule to restrict the source IP to 100.100.96.0/20 (100.100.96.1 to 100.100.111.254).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ sudo iptables --delete ts-input --source 100.64.0.0/10 ! -i tailscale0 -j DROP
$ sudo iptables --insert ts-input 3 --source 100.100.96.0/20 ! -i tailscale0 -j DROP

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
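&lt;p&gt;To sanity-check which source addresses the narrower rule matches, here is a pure-bash CIDR containment check (an illustrative sketch; tools like &lt;code&gt;ipcalc&lt;/code&gt; can do the same):&lt;/p&gt;

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  set -- $1
  echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}

# Succeed when IP $1 falls inside network $2 with prefix length $3.
# Two addresses share a /N block when they agree after dividing by the block size.
in_cidr() {
  local size=$(( 2 ** (32 - $3) ))
  [ $(( $(ip_to_int "$1") / size )) -eq $(( $(ip_to_int "$2") / size )) ]
}

if in_cidr 100.100.96.5 100.100.96.0 20; then echo "matched by the new rule"; fi
if ! in_cidr 100.64.0.1 100.100.96.0 20; then echo "outside the restricted pool"; fi
```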



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspqisr90v6r4lom90jnh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspqisr90v6r4lom90jnh.png" alt="tailscale iptables rules" width="800" height="582"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;By configuring tailscale to use a smaller subnet, we can avoid conflicts with existing networks. Even though there is a bug in tailscale, we can manually update the iptables rules as a workaround.&lt;/p&gt;




&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://avilpage.com/2023/09/tailscale-remote-ssh-raspberry-pi.html" rel="noopener noreferrer"&gt;tailscale: Remotely access any device&lt;/a&gt; &lt;a href="https://avilpage.com/2024/09/tailscale-cgnat-conflicts-resolution.html#fnref:ap" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://tailscale.com/kb/1015/100.x-addresses" rel="noopener noreferrer"&gt;https://tailscale.com/kb/1015/100.x-addresses&lt;/a&gt; &lt;a href="https://avilpage.com/2024/09/tailscale-cgnat-conflicts-resolution.html#fnref:ts100" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Private_network" rel="noopener noreferrer"&gt;https://en.wikipedia.org/wiki/Private_network&lt;/a&gt; &lt;a href="https://avilpage.com/2024/09/tailscale-cgnat-conflicts-resolution.html#fnref:pn" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://tailscale.com/kb/1304/ip-pool" rel="noopener noreferrer"&gt;https://tailscale.com/kb/1304/ip-pool&lt;/a&gt; &lt;a href="https://avilpage.com/2024/09/tailscale-cgnat-conflicts-resolution.html#fnref:ip" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/tailscale/tailscale/issues/1381" rel="noopener noreferrer"&gt;https://github.com/tailscale/tailscale/issues/1381&lt;/a&gt; &lt;a href="https://avilpage.com/2024/09/tailscale-cgnat-conflicts-resolution.html#fnref:tsb" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>networking</category>
      <category>tailscale</category>
    </item>
    <item>
      <title>Mastering Kraken2 - Part 4 - Build FDA-ARGOS Index</title>
      <dc:creator>Chillar Anand</dc:creator>
      <pubDate>Sat, 24 Aug 2024 09:58:00 +0000</pubDate>
      <link>https://dev.to/chillaranand/mastering-kraken2-part-4-build-fda-argos-index-bfb</link>
      <guid>https://dev.to/chillaranand/mastering-kraken2-part-4-build-fda-argos-index-bfb</guid>
      <description>&lt;h4&gt;
  
  
  Mastering Kraken2
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-initial-runs.html" rel="noopener noreferrer"&gt;Part 1 - Initial Runs&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html" rel="noopener noreferrer"&gt;Part 2 - Classification Performance Optimisation&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-build-custom-db.html" rel="noopener noreferrer"&gt;Part 3 - Build custom database indices&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://avilpage.com/2024/08/mastering-kraken2-fda-argos-index.html" rel="noopener noreferrer"&gt;Part 4 - Buil FDA-ARGOS index&lt;/a&gt; (this post)&lt;/p&gt;

&lt;p&gt;Part 5 - Regular vs Fast Builds (upcoming)&lt;/p&gt;

&lt;p&gt;Part 6 - Benchmarking (upcoming)&lt;/p&gt;

&lt;h4&gt;
  
  
  Introduction
&lt;/h4&gt;

&lt;p&gt;In the previous post, we learnt how to build a custom index for Kraken2.&lt;/p&gt;

&lt;p&gt;FDA-ARGOS&lt;sup id="fnref:argos"&gt;&lt;a href="https://avilpage.com/2024/08/mastering-kraken2-fda-argos-index.html#fn:argos" rel="noopener noreferrer"&gt;1&lt;/a&gt;&lt;/sup&gt; is a popular database with quality reference genomes for diagnostic usage. Let's build an index for FDA-ARGOS.&lt;/p&gt;

&lt;h4&gt;
  
  
  FDA-ARGOS Kraken2 Index
&lt;/h4&gt;

&lt;p&gt;The FDA-ARGOS database is available at NCBI&lt;sup id="fnref:ncbi"&gt;&lt;a href="https://avilpage.com/2024/08/mastering-kraken2-fda-argos-index.html#fn:ncbi" rel="noopener noreferrer"&gt;2&lt;/a&gt;&lt;/sup&gt;, from which we can download the assembly details file.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7cnyxh3r9n5asqpzu92v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7cnyxh3r9n5asqpzu92v.png" alt="FDA-ARGOS NCBI" width="800" height="389"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can extract the accession numbers from the assembly details file and then download the genomes for these accession IDs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ grep -e "^#" -v PRJNA231221_AssemblyDetails.txt | cut -d$'\t' -f1 &amp;gt; accessions.txt

$ wc accessions.txt
 1428 1428 22848 accessions.txt

$ ncbi-genome-download --section genbank --assembly-accessions accessions.txt --progress-bar bacteria --parallel 40

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It took ~8 minutes to download all the genomes, and the downloaded file size is ~4GB.&lt;/p&gt;

&lt;p&gt;We can use the kraken-db-builder&lt;sup id="fnref:kdb"&gt;&lt;a href="https://avilpage.com/2024/08/mastering-kraken2-fda-argos-index.html#fn:kdb" rel="noopener noreferrer"&gt;3&lt;/a&gt;&lt;/sup&gt; tool to build an index from these GenBank genome files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# kraken-db-builder needs this to convert gbff to fasta format
$ conda install -c bioconda any2fasta

$ kraken-db-builder --genomes-dir genbank --threads 36 --db-name k2_argos

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It took ~30 minutes to build the index.&lt;/p&gt;

&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;We have built a Kraken2 index for the FDA-ARGOS database on 2024-Aug-24.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/ChillarAnand/avilpage.com/tree/master/scripts/kraken2_argos" rel="noopener noreferrer"&gt;FDA-ARGOS Library&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://drive.google.com/file/d/1PbwriW3i3pkXJMFF5nq9OK_EqrwPiLWr/view" rel="noopener noreferrer"&gt;Kraken2 Gzipped Index file&lt;/a&gt; (gzip size: 2.6GB, index size: 3.8GB, md5sum: 1dd946d2e405dfec35ed3e319e9dfeac)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ChillarAnand/avilpage.com/tree/master/scripts/kraken2_argos" rel="noopener noreferrer"&gt;Kraken2 Inspect file&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the next post, we will look at the differences between regular and fast builds.&lt;/p&gt;




&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.nature.com/articles/s41467-019-11306-6" rel="noopener noreferrer"&gt;https://www.nature.com/articles/s41467-019-11306-6&lt;/a&gt; &lt;a href="https://avilpage.com/2024/08/mastering-kraken2-fda-argos-index.html#fnref:argos" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.ncbi.nlm.nih.gov/bioproject/231221" rel="noopener noreferrer"&gt;https://www.ncbi.nlm.nih.gov/bioproject/231221&lt;/a&gt; &lt;a href="https://avilpage.com/2024/08/mastering-kraken2-fda-argos-index.html#fnref:ncbi" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://avilpage.com/kdb.html" rel="noopener noreferrer"&gt;https://avilpage.com/kdb.html&lt;/a&gt; &lt;a href="https://avilpage.com/2024/08/mastering-kraken2-fda-argos-index.html#fnref:kdb" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
    </item>
    <item>
      <title>Midnight Coding for Narendra Modi &amp; Ivanka Trump</title>
      <dc:creator>Chillar Anand</dc:creator>
      <pubDate>Sun, 18 Aug 2024 00:25:43 +0000</pubDate>
      <link>https://dev.to/chillaranand/midnight-coding-for-narendra-modi-ivanka-trump-4pjf</link>
      <guid>https://dev.to/chillaranand/midnight-coding-for-narendra-modi-ivanka-trump-4pjf</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4L89Z6X3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://avilpage.com/images/midnight-modi-trump.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4L89Z6X3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://avilpage.com/images/midnight-modi-trump.png" alt="GES 2017, modi trump mitra" width="800" height="536"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Introduction
&lt;/h4&gt;

&lt;p&gt;In 2017, the GES (Global Entrepreneurship Summit) was held in Hyderabad, India. Narendra Modi (the Prime Minister of India) &amp;amp; Ivanka Trump (daughter of the then US President Donald Trump) were the chief guests.&lt;/p&gt;

&lt;p&gt;At that time, I was part of the Invento team, and we decided to develop a new version of the Mitra robot for the event.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Challenge
&lt;/h4&gt;

&lt;p&gt;We had to develop the new version of the Mitra robot in a short span of time. The entire team worked day and night to meet the deadline.&lt;/p&gt;

&lt;p&gt;We traveled from Bangalore to Hyderabad a few days early to prepare. Before the event, we cleared multiple security checks and did demos for various people.&lt;/p&gt;

&lt;p&gt;A day before the event, around 9 PM, we discovered a critical bug in the software. Because of it, the robot's motors were running at full speed, which was dangerous: if the robot hit someone at full speed, it could cause serious injuries.&lt;/p&gt;

&lt;p&gt;I spent a few hours debugging and even tried rolling back a few versions. Still, I couldn't pinpoint the issue.&lt;/p&gt;

&lt;p&gt;Since we needed only a small set of robot features, we decided to create a new version of the software with just those features. I spent the next few hours creating the new release.&lt;/p&gt;

&lt;p&gt;After that, we spent several more hours testing extensively to make sure there were no bugs in the new version.&lt;/p&gt;

&lt;p&gt;It was almost morning by the time we were done. We quickly went to the hotel to get some rest and be back early for the event.&lt;/p&gt;

&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;Mitra robot welcoming Modi &amp;amp; Trump went very well. You can read about Balaji Viswanathan's experience at GES 2017 on Quora&lt;sup id="fnref:quora"&gt;&lt;a href="https://avilpage.com/2024/08/midnight-coding-narendra-modi-ivanka-trump.html#fn:quora" rel="noopener noreferrer"&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9-nPcjjB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://avilpage.com/images/midnight-modi-trump-anand.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9-nPcjjB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://avilpage.com/images/midnight-modi-trump-anand.jpg" alt="GES 2017, modi trump mitra anand" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://www.quora.com/How-was-Balaji-Viswanathans-overall-experience-attending-the-Global-Entrepreneurship-Summit-2017-held-in-Hyderabad-Where-his-Inventos-Mitra-is-launched" rel="noopener noreferrer"&gt;Answer on Quora&lt;/a&gt; &lt;a href="https://avilpage.com/2024/08/midnight-coding-narendra-modi-ivanka-trump.html#fnref:quora" rel="noopener noreferrer"&gt;↩&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>musings</category>
    </item>
    <item>
      <title>How (and when) to use systemd timer instead of cronjob</title>
      <dc:creator>Chillar Anand</dc:creator>
      <pubDate>Mon, 05 Aug 2024 07:37:50 +0000</pubDate>
      <link>https://dev.to/chillaranand/how-and-when-to-use-systemd-timer-instead-of-cronjob-15d</link>
      <guid>https://dev.to/chillaranand/how-and-when-to-use-systemd-timer-instead-of-cronjob-15d</guid>
      <description>&lt;h4&gt;
  
  
  Introduction
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;* * * * * bash demo.sh

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Just a single line of code is sufficient to schedule a cron job. However, there are some scenarios where I find systemd timer more useful than cronjob.&lt;/p&gt;

&lt;h4&gt;
  
  
  How to use systemd timer
&lt;/h4&gt;

&lt;p&gt;We need to create a service file (containing the command to run) and a timer file (containing the schedule).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# demo.service
[Unit]
Description=Demo service

[Service]
ExecStart=bash demo.sh


# demo.timer
[Unit]
Description=Run myscript.service every 1 minutes

[Timer]
OnBootSec=1min
OnUnitActiveSec=1min

[Install]
WantedBy=multi-user.target

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can copy these files to &lt;code&gt;/etc/systemd/system/&lt;/code&gt; and enable the timer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ sudo cp demo.service demo.timer /etc/systemd/system/

$ sudo systemctl daemon-reload

$ sudo systemctl enable --now demo.timer

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can use &lt;code&gt;systemctl&lt;/code&gt; to see when the task was last executed and when it will run next.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ sudo systemctl list-timers --all

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--AfCfx-Jk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://avilpage.com/images/systemd-timer-cronjob.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--AfCfx-Jk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://avilpage.com/images/systemd-timer-cronjob.png" alt="systemd timer" width="800" height="318"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Use Cases
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Singleton - In the above example, let's say &lt;code&gt;demo.sh&lt;/code&gt; takes ~10 minutes to run. With a cron job, after ten minutes we would have up to 10 instances of &lt;code&gt;demo.sh&lt;/code&gt; running concurrently, which is not ideal. A systemd timer ensures only one instance of &lt;code&gt;demo.sh&lt;/code&gt; runs at a time, since the service is not triggered again while it is still active.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;On-demand runs - If we want to test out the script/job, systemd lets us run it immediately with the usual &lt;code&gt;systemctl start demo&lt;/code&gt;, without needing to run the script manually.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Timer - With cron, scheduling precision is limited to one minute. A systemd timer can schedule tasks down to &lt;code&gt;second&lt;/code&gt;-level precision.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Timer]
OnCalendar=*-*-* 15:30:15

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In addition to that, we can run tasks based on system events. For example, we can run a script 15 minutes after boot.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Timer]
OnBootSec=15min

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
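For comparison, the singleton behaviour described in the use cases above is usually bolted onto plain cron with flock(1). The sketch below is illustrative, not from the original post: the lock file path and the `sleep` stand-in for `demo.sh` are assumptions.

```shell
# Sketch: skipping overlapping cron runs with flock(1).
# In a crontab the entry would look like:
#   * * * * * flock -n /tmp/demo.lock bash demo.sh
# flock -n exits immediately if the lock is already held, so a new run
# is skipped while the previous one is still active -- roughly what a
# systemd timer gives you for free.

lock=$(mktemp)                   # illustrative lock file
flock -n "$lock" sleep 1 &       # first "run" holds the lock for 1 second
sleep 0.2
if flock -n "$lock" true; then   # second "run" tries the same lock
  echo "started"
else
  echo "skipped: previous run still active"
fi
wait
```

Unlike systemd, cron itself never checks whether the previous invocation has finished; flock adds that guarantee externally.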



&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;Systemd timer is a powerful tool that can replace cronjob in many scenarios, as it provides more control and flexibility than cronjob. However, cronjob is still a good choice for simple scheduling tasks.&lt;/p&gt;

</description>
      <category>automation</category>
      <category>devops</category>
    </item>
    <item>
      <title>Mastering Kraken2 - Part 3 - Build Custom Database</title>
      <dc:creator>Chillar Anand</dc:creator>
      <pubDate>Thu, 01 Aug 2024 05:22:30 +0000</pubDate>
      <link>https://dev.to/chillaranand/mastering-kraken2-part-3-build-custom-database-3gkb</link>
      <guid>https://dev.to/chillaranand/mastering-kraken2-part-3-build-custom-database-3gkb</guid>
      <description>&lt;h4&gt;
  
  
  Mastering Kraken2
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-initial-runs.html" rel="noopener noreferrer"&gt;Part 1 - Initial Runs&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html" rel="noopener noreferrer"&gt;Part 2 - Classification Performance Optimisation&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-build-custom-db.html" rel="noopener noreferrer"&gt;Part 3 - Building custom databases&lt;/a&gt; (this post)&lt;/p&gt;

&lt;p&gt;Part 4 - Regular vs Fast Builds (upcoming)&lt;/p&gt;

&lt;p&gt;Part 5 - Benchmarking (upcoming)&lt;/p&gt;

&lt;h4&gt;
  
  
  Introduction
&lt;/h4&gt;

&lt;p&gt;In the previous post, we learned how to improve kraken2&lt;sup id="fnref:k2"&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-build-custom-db.html#fn:k2" rel="noopener noreferrer"&gt;1&lt;/a&gt;&lt;/sup&gt; classification performance. So far we have downloaded &amp;amp; used pre-built genome indices (databases).&lt;/p&gt;

&lt;p&gt;In this post, let's build a custom database for kraken2. For simplicity, let's use only refseq archaea genomes&lt;sup id="fnref:rag"&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-build-custom-db.html#fn:rag" rel="noopener noreferrer"&gt;2&lt;/a&gt;&lt;/sup&gt; for building the index.&lt;/p&gt;

&lt;h4&gt;
  
  
  Building Custom Database
&lt;/h4&gt;

&lt;p&gt;First, we need to download the taxonomy files. We can use the &lt;code&gt;k2&lt;/code&gt; script provided by kraken2.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ k2 download-taxonomy --db custom_db

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This takes ~30 minutes depending on the network speed. The taxonomy files are downloaded to the &lt;code&gt;custom_db/taxonomy&lt;/code&gt; directory.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ ls custom_db/taxonomy
citations.dmp division.dmp gencode.dmp merged.dmp nodes.dmp
nucl_wgs.accession2taxid delnodes.dmp gc.prt 
images.dmp names.dmp nucl_gb.accession2taxid readme.txt

$ du -hs custom_db/taxonomy
43G custom_db/taxonomy

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, let's download the archaea RefSeq genomes. We can use the &lt;code&gt;k2 download-library&lt;/code&gt; command to fetch them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ k2 download-library --library archaea --db custom_db

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This download runs on a single thread. Instead, we can use the &lt;code&gt;ncbi-genome-download&lt;/code&gt;&lt;sup id="fnref:ngd"&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-build-custom-db.html#fn:ngd" rel="noopener noreferrer"&gt;3&lt;/a&gt;&lt;/sup&gt; tool to download the genomes. It provides much more granular control over the download process. For example, we can download only &lt;code&gt;--assembly-levels complete&lt;/code&gt; genomes, and we can download multiple genomes in parallel.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ pip install ncbi-genome-download

$ conda install -c bioconda ncbi-genome-download

$ ncbi-genome-download -s refseq -F fasta --parallel 40 -P archaea
Checking assemblies: 100%|███| 2184/2184 [00:19&amp;lt;00:00, 111.60entries/s]
Downloading assemblies: 100%|███| 2184/2184 [02:04&amp;lt;00:00, 4.54s/files]
Downloading assemblies: 2184files [02:23, 2184files/s]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In just 2 minutes, it has downloaded all the files. Let's gunzip them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ find refseq -name "\*.gz" -print0 | parallel -0 gunzip

$ du -hs refseq
5.9G refseq

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's add all FASTA genome files to the custom database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ time find refseq -name "\*.fna" -exec kraken2-build --add-to-library {} --db custom_db \;
667.46s user 90.78s system 106% cpu 12:54.80 total

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;kraken2-build&lt;/code&gt; doesn't use multiple threads for adding genomes to the database. It also doesn't check whether a genome is already present in the database.&lt;/p&gt;

&lt;p&gt;Let's use &lt;code&gt;k2&lt;/code&gt; for adding genomes to the database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export KRAKEN\_NUM\_THREADS=40

$ find . -name "\*.fna" -exec k2 add-to-library --files {} --db custom_db \;
668.37s user 88.44s system 159% cpu 7:54.40 total

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This took only half the time compared to &lt;code&gt;kraken2-build&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Let's build the index from the library.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ time kraken2-build --db custom_db --build --threads 36
Creating sequence ID to taxonomy ID map (step 1)...
Found 0/125783 targets, searched through 60000000 accession IDs...
Found 59923/125783 targets, searched through 822105735 accession IDs, search complete.
lookup_accession_numbers: 65860/125783 accession numbers remain unmapped, see unmapped.txt in DB directory
Sequence ID to taxonomy ID map complete. [2m1.950s]
Estimating required capacity (step 2)...
Estimated hash table requirement: 5340021028 bytes
Capacity estimation complete. [23.875s]
Building database files (step 3)...
Taxonomy parsed and converted.
CHT created with 11 bits reserved for taxid.
Completed processing of 59911 sequences, 3572145823 bp
Writing data to disk... complete.
Database files completed. [12m3.368s]
Database construction complete. [Total: 14m29.666s]
kraken2-build --db custom_db --build --threads 36 24534.98s user 90.50s system 2831% cpu 14:29.75 total

$ ls -ll
.rw-rw-r-- 5.3G anand 1 Aug 16:35 hash.k2d
drwxrwxr-x - anand 1 Aug 12:32 library
.rw-rw-r-- 64 anand 1 Aug 16:35 opts.k2d
.rw-rw-r-- 1.5M anand 1 Aug 16:22 seqid2taxid.map
.rw-rw-r-- 115k anand 1 Aug 16:23 taxo.k2d
lrwxrwxrwx 20 anand 1 Aug 12:31 taxonomy
.rw-rw-r-- 1.2M anand 1 Aug 16:22 unmapped.txt

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We were able to build an index for ~6GB of input files in ~15 minutes.&lt;/p&gt;

&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;We learnt some useful tips to speed up the custom database creation process. In the next post, we will learn about regular vs. fast builds.&lt;/p&gt;




&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://ccb.jhu.edu/software/kraken2/" rel="noopener noreferrer"&gt;Kraken2&lt;/a&gt; &lt;a href="https://avilpage.com/2024/07/mastering-kraken2-build-custom-db.html#fnref:k2" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://ftp.ncbi.nlm.nih.gov/genomes/refseq/archaea/" rel="noopener noreferrer"&gt;RefSeq Archaea genomes&lt;/a&gt; &lt;a href="https://avilpage.com/2024/07/mastering-kraken2-build-custom-db.html#fnref:rag" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/kblin/ncbi-genome-download" rel="noopener noreferrer"&gt;https://github.com/kblin/ncbi-genome-download&lt;/a&gt; &lt;a href="https://avilpage.com/2024/07/mastering-kraken2-build-custom-db.html#fnref:ngd" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>devops</category>
      <category>kraken2</category>
      <category>metagenomics</category>
    </item>
    <item>
      <title>Mastering kraken2 - Part 2 - Performance Optimisation</title>
      <dc:creator>Chillar Anand</dc:creator>
      <pubDate>Sun, 28 Jul 2024 05:21:30 +0000</pubDate>
      <link>https://dev.to/chillaranand/mastering-kraken2-part-2-performance-optimisation-1ln9</link>
      <guid>https://dev.to/chillaranand/mastering-kraken2-part-2-performance-optimisation-1ln9</guid>
      <description>&lt;h4&gt;
  
  
  Mastering Kraken2
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-initial-runs.html" rel="noopener noreferrer"&gt;Part 1 - Initial Runs&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Part 2 - Performance Optimisation (this post)&lt;/p&gt;

&lt;p&gt;Part 3 - Custom Indices (upcoming)&lt;/p&gt;

&lt;h4&gt;
  
  
  Introduction
&lt;/h4&gt;

&lt;p&gt;In the previous post, we learned how to set up kraken2&lt;sup id="fnref:k2"&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html#fn:k2" rel="noopener noreferrer"&gt;1&lt;/a&gt;&lt;/sup&gt;, download pre-built indices, and run kraken2. In this post, we will learn various ways to speed up the classification process.&lt;/p&gt;

&lt;h4&gt;
  
  
  Increasing RAM
&lt;/h4&gt;

&lt;p&gt;The Kraken2 standard database is ~80GB in size. It is recommended to have at least as much RAM as the database size to run kraken2 efficiently&lt;sup id="fnref:ksr"&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html#fn:ksr" rel="noopener noreferrer"&gt;2&lt;/a&gt;&lt;/sup&gt;. Let's use a 128GB RAM machine and run kraken2 with the ERR10359977&lt;sup id="fnref:err"&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html#fn:err" rel="noopener noreferrer"&gt;3&lt;/a&gt;&lt;/sup&gt; sample.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ time kraken2 --db k2_standard --report report.txt ERR10359977.fastq.gz &amp;gt; output.txt
Loading database information... done.
95064 sequences (14.35 Mbp) processed in 2.142s (2662.9 Kseq/m, 402.02 Mbp/m).
  94816 sequences classified (99.74%)
  248 sequences unclassified (0.26%)
kraken2 --db k2_standard --report report.txt ERR10359977.fastq.gz &amp;gt; 1.68s user 152.19s system 35% cpu 7:17.55 total

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the time taken has come down from 40 minutes to 7 minutes. The classification speed has also increased from 0.19 Mbp/m to 402.02 Mbp/m.&lt;/p&gt;

&lt;p&gt;The previous sample had only a few reads, and the speed is not a good indicator. Let's run kraken2 with a larger sample.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ time kraken2 --db k2_standard --report report.txt --paired SRR6915097_1.fastq.gz SRR6915097_2.fastq.gz &amp;gt; output.txt
Loading database information... done.
Processed 14980000 sequences (2972330207 bp) ...
17121245 sequences (3397.15 Mbp) processed in 797.424s (1288.2 Kseq/m, 255.61 Mbp/m).
  9826671 sequences classified (57.39%)
  7294574 sequences unclassified (42.61%)
kraken2 --db k2_standard --report report.txt --paired &amp;gt; output.txt 526.39s user 308.24s system 68% cpu 20:23.86 total

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This took almost 20 minutes to classify ~3 Gbp of data. Of those 20 minutes, 13 were spent on classification; the rest went to loading the db into memory.&lt;/p&gt;

&lt;p&gt;Let's use the k2_plusPF&lt;sup id="fnref:k2p"&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html#fn:k2p" rel="noopener noreferrer"&gt;4&lt;/a&gt;&lt;/sup&gt; db, which is roughly twice the size of k2_standard, and run kraken2 again.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ time kraken2 --db k2_plusfp --report report.txt --paired SRR6915097_1.fastq.gz SRR6915097_2.fastq.gz &amp;gt; output.txt
Loading database information...done.
17121245 sequences (3397.15 Mbp) processed in 755.290s (1360.1 Kseq/m, 269.87 Mbp/m).
  9903824 sequences classified (57.85%)
  7217421 sequences unclassified (42.15%)
kraken2 --db k2_plusfp/ --report report.txt --paired SRR6915097_1.fastq.gz &amp;gt; 509.71s user 509.51s system 55% cpu 30:35.49 total

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This took ~30 minutes to complete, but the classification itself took only ~13 minutes, similar to k2_standard. The remaining time was spent loading the db into memory.&lt;/p&gt;

&lt;h4&gt;
  
  
  Preloading db into RAM
&lt;/h4&gt;

&lt;p&gt;We can use vmtouch&lt;sup id="fnref:vmt"&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html#fn:vmt" rel="noopener noreferrer"&gt;5&lt;/a&gt;&lt;/sup&gt; to preload the db into RAM. kraken2 provides a &lt;code&gt;--memory-mapping&lt;/code&gt; option to use the preloaded db.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ vmtouch -vt k2_standard/hash.k2d k2_standard/opts.k2d k2_standard/taxo.k2d
           Files: 3
     Directories: 0
   Touched Pages: 20382075 (77G)
         Elapsed: 434.77 seconds

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let's run kraken2 with &lt;code&gt;--memory-mapping&lt;/code&gt; option.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ time kraken2 --db k2_standard --report report.txt --memory-mapping --paired SRR6915097_1.fastq.gz SRR6915097_2.fastq.gz &amp;gt; output.txt
Loading database information... done.
17121245 sequences (3397.15 Mbp) processed in 532.486s (1929.2 Kseq/m, 382.79 Mbp/m).
  9826671 sequences classified (57.39%)
  7294574 sequences unclassified (42.61%)
  kraken2 --db k2_standard --report report.txt --paired SRR6915097_1.fastq.gz &amp;gt; 424.20s user 11.76s system 81% cpu 8:54.98 total

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the classification took only ~10 minutes.&lt;/p&gt;

&lt;h4&gt;
  
  
  Multithreading
&lt;/h4&gt;

&lt;p&gt;kraken2 supports multiple threads. I am using a machine with 40 threads.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ time kraken2 --db k2_standard --report report.txt --paired SRR6915097_1.fastq.gz SRR6915097_2.fastq.gz --memory-mapping --threads 32 &amp;gt; output.txt
Loading database information... done.
17121245 sequences (3397.15 Mbp) processed in 71.675s (14332.5 Kseq/m, 2843.81 Mbp/m).
  9826671 sequences classified (57.39%)
  7294574 sequences unclassified (42.61%)
kraken2 --db k2_standard --report report.txt --paired SRR6915097_1.fastq.gz 556.58s user 22.85s system 762% cpu 1:16.02 total

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With 32 threads, the classification took only 1 minute. Beyond 32 threads, the classification time did not decrease significantly.&lt;/p&gt;

&lt;h4&gt;
  
  
  Optimising input files
&lt;/h4&gt;

&lt;p&gt;So far we have used gzipped input files. Let's use unzipped input files and run kraken2.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ gunzip SRR6915097_1.fastq.gz
$ gunzip SRR6915097_2.fastq.gz

$ time kraken2 --db k2_standard --report report.txt --paired SRR6915097_1.fastq SRR6915097_2.fastq --memory-mapping --threads 30 &amp;gt; output.txt
Loading database information... done.
17121245 sequences (3397.15 Mbp) processed in 34.809s (29512.0 Kseq/m, 5855.68 Mbp/m).
  9826671 sequences classified (57.39%)
  7294574 sequences unclassified (42.61%)
kraken2 --db k2_standard --report report.txt --paired SRR6915097_1.fastq 30 565.03s user 17.12s system 1530% cpu 38.047 total

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the classification time has come down to 40 seconds.&lt;/p&gt;

&lt;p&gt;Since the input fastq files are paired, kraken2 also spends time interleaving them on the fly. Let's interleave the files beforehand and run kraken2.&lt;/p&gt;

&lt;p&gt;To interleave the files, let's use the &lt;code&gt;seqfu&lt;/code&gt; tool.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ conda install -y -c conda-forge -c bioconda "seqfu&amp;gt;1.10"

$ seqfu interleave -1 SRR6915097_1.fastq.gz -2 SRR6915097_2.fastq.gz &amp;gt; SRR6915097.fq

$ time kraken2 --db k2_standard --report report.txt --memory-mapping SRR6915097.fq --threads 32 &amp;gt; output.txt
Loading database information... done.
34242490 sequences (3397.15 Mbp) processed in 20.199s (101714.1 Kseq/m, 10090.91 Mbp/m).
  17983321 sequences classified (52.52%)
  16259169 sequences unclassified (47.48%)
kraken2 --db k2_standard --report report.txt --memory-mapping SRR6915097.fq 32 618.96s user 18.24s system 2653% cpu 24.013 total

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the classification time has come down to 24 seconds.&lt;/p&gt;

&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;In terms of classification speed, we have come a long way from under 1 Mbp/m to over 10,000 Mbp/m. In the next post, we will learn how to optimise the creation of custom indices.&lt;/p&gt;




&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://ccb.jhu.edu/software/kraken2/" rel="noopener noreferrer"&gt;Kraken2&lt;/a&gt; &lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html#fnref:k2" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/DerrickWood/kraken2/blob/master/docs/MANUAL.markdown#system-requirements" rel="noopener noreferrer"&gt;Kraken System Requirements&lt;/a&gt; &lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html#fnref:ksr" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="//ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR103/077/ERR10359977/ERR10359977.fastq.gz"&gt;ERR10359977.fastq.gz&lt;/a&gt; &lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html#fnref:err" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://benlangmead.github.io/aws-indexes/k2" rel="noopener noreferrer"&gt;Genomic Index Zone - k2&lt;/a&gt; &lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html#fnref:k2p" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://hoytech.com/vmtouch/" rel="noopener noreferrer"&gt;https://hoytech.com/vmtouch/&lt;/a&gt; &lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html#fnref:vmt" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>devops</category>
      <category>kraken2</category>
      <category>metagenomics</category>
    </item>
    <item>
      <title>Mastering Kraken2 - Part 1 - Initial Runs</title>
      <dc:creator>Chillar Anand</dc:creator>
      <pubDate>Sun, 28 Jul 2024 05:14:25 +0000</pubDate>
      <link>https://dev.to/chillaranand/mastering-kraken2-part-1-initial-runs-ajl</link>
      <guid>https://dev.to/chillaranand/mastering-kraken2-part-1-initial-runs-ajl</guid>
      <description>&lt;h4&gt;
  
  
  Mastering Kraken2
&lt;/h4&gt;

&lt;p&gt;Part 1 - Initial Runs (this post)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html" rel="noopener noreferrer"&gt;Part 2 - Performance Optimisation&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Part 3 - Custom Indices (upcoming)&lt;/p&gt;

&lt;h4&gt;
  
  
  Introduction
&lt;/h4&gt;

&lt;p&gt;Kraken2&lt;sup id="fnref:Kraken2"&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-initial-runs.html#fn:Kraken2" rel="noopener noreferrer"&gt;1&lt;/a&gt;&lt;/sup&gt; is widely used for metagenomics taxonomic classification, and it has pre-built indexes for many organisms. In this series, we will learn&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How to set up kraken2, download pre-built indices&lt;/li&gt;
&lt;li&gt;Run kraken2 (8GB RAM) at ~0.19 Mbp/m (million base pairs per minute)&lt;/li&gt;
&lt;li&gt;Learn various ways to speed up the classification process&lt;/li&gt;
&lt;li&gt;Run kraken2 (128GB RAM) at ~1200 Mbp/m&lt;/li&gt;
&lt;li&gt;Build custom indices&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Installation
&lt;/h4&gt;

&lt;p&gt;Let's start with an 8 GB RAM machine. We can install kraken2 using the &lt;code&gt;install_kraken2.sh&lt;/code&gt; script as per the manual&lt;sup id="fnref:install_kraken2"&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-initial-runs.html#fn:install_kraken2" rel="noopener noreferrer"&gt;2&lt;/a&gt;&lt;/sup&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ git clone https://github.com/DerrickWood/kraken2
$ cd kraken2
$ ./install_kraken2.sh /usr/local/bin
# ensure kraken2 is in the PATH
$ export PATH=$PATH:/usr/local/bin

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you already have conda installed, you can install kraken2 from conda as well.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ conda install -c bioconda kraken2

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Download pre-built indices
&lt;/h4&gt;

&lt;p&gt;Building kraken2 indices takes a lot of time and resources. For now, let's download and use the pre-built indices. In the final post, we will learn how to build the indices.&lt;/p&gt;

&lt;p&gt;Genomic Index Zone&lt;sup id="fnref:GenomicIndexZone"&gt;&lt;a href="https://avilpage.com/2024/07/mastering-kraken2-initial-runs.html#fn:GenomicIndexZone" rel="noopener noreferrer"&gt;3&lt;/a&gt;&lt;/sup&gt; provides pre-built indices for kraken2. Let's download the standard database. It contains RefSeq archaea, bacteria, viral, plasmid, human, &amp;amp; UniVec_Core sequences.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ wget https://genome-idx.s3.amazonaws.com/kraken/k2_standard_20240605.tar.gz
$ mkdir k2_standard
$ tar -xvf k2_standard_20240605.tar.gz -C k2_standard

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The extracted directory contains three files - &lt;code&gt;hash.k2d&lt;/code&gt;, &lt;code&gt;opts.k2d&lt;/code&gt;, &lt;code&gt;taxo.k2d&lt;/code&gt; which are the kraken2 database files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ ls -l *.k2d
.rw-r--r-- 83G anand 13 Jul 12:34 hash.k2d
.rw-r--r-- 64 anand 13 Jul 12:34 opts.k2d
.rw-r--r-- 4.0M anand 13 Jul 12:34 taxo.k2d

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Classification
&lt;/h4&gt;

&lt;p&gt;To run the taxonomic classification, let's use the &lt;code&gt;ERR10359977&lt;/code&gt; human gut metagenome from NCBI SRA.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ wget https://ftp.sra.ebi.ac.uk/vol1/fastq/ERR103/077/ERR10359977/ERR10359977.fastq.gz
$ kraken2 --db k2_standard --report report.txt ERR10359977.fastq.gz &amp;gt; output.txt

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By default, the machine I have used has 8GB RAM and an additional 8GB of swap. Since kraken2 needs the entire db (~80GB) in memory, the kernel will kill the process when it tries to consume more than 16GB.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ time kraken2 --db k2_standard --paired SRR6915097_1.fastq.gz SRR6915097_2.fastq.gz &amp;gt; output.txt
Loading database information...Command terminated by signal 9
0.02user 275.83system 8:17.43elapsed 55%CPU 

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To prevent this, let's increase the swap space to 128 GB.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Create an empty swapfile of 128GB
sudo dd if=/dev/zero of=/swapfile bs=1G count=128

# Turn swap off - It might take several minutes
sudo swapoff -a

# Set the permissions for swapfile
sudo chmod 0600 /swapfile

# make it a swap area
sudo mkswap /swapfile  

# Turn the swap on
sudo swapon /swapfile

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
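After turning the swap on, it's worth confirming that the kernel actually sees it. A minimal check (a sketch; it assumes a Linux machine where the swapfile steps above were run as root):

```shell
# List active swap areas; the new /swapfile should be listed here
swapon --show

# SwapTotal in /proc/meminfo reflects the enlarged swap space
grep -i swap /proc/meminfo
```

If `/swapfile` is missing from the output, re-check the `mkswap`/`swapon` steps before launching kraken2.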



&lt;p&gt;We can time the classification process using the &lt;code&gt;time&lt;/code&gt; command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ time kraken2 --db k2_standard --report report.txt ERR10359977.fastq.gz &amp;gt; output.txt

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you have a machine with large RAM, the same scenario can be simulated using &lt;code&gt;systemd-run&lt;/code&gt;. This will limit the memory usage of kraken2 to 6.5GB.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ time systemd-run --scope -p MemoryMax=6.5G --user time kraken2 --db k2_standard --report report.txt ERR10359977.fastq.gz &amp;gt; output.txt

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Depending on the CPU performance, this will take around 40 minutes to complete.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Loading database information... done.
95064 sequences (14.35 Mbp) processed in 1026.994s (5.6 Kseq/m, 0.84 Mbp/m).
  94939 sequences classified (99.87%)
  125 sequences unclassified (0.13%)
  4.24user 658.68system 38:26.78elapsed 28%CPU 

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we try a gut WGS (Whole Genome Sequencing) sample like &lt;code&gt;SRR6915097&lt;/code&gt;, which contains ~3.3 Gbp, it will take weeks to complete.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ time systemd-run --scope -p MemoryMax=6G --user time kraken2 --db k2_standard --paired SRR6915097_1.fastq.gz SRR6915097_2.fastq.gz &amp;gt; output.txt

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I tried running this on the 8 GB machine. Even after 10 days, it had processed only 10% of the data.&lt;/p&gt;

&lt;p&gt;If we have to process a large number of such samples, it would take months, which is not practical.&lt;/p&gt;

&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;In this post, we ran kraken2 on an 8GB machine and learned that it is not feasible to run kraken2 on large samples.&lt;/p&gt;

&lt;p&gt;In the next post, we will learn how to speed up the classification process and run classification at 1200 Mbp/m.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Next&lt;/strong&gt; : &lt;a href="https://avilpage.com/2024/07/mastering-kraken2-performance-optimisation.html" rel="noopener noreferrer"&gt;Part 2 - Performance Optimisation&lt;/a&gt;&lt;/p&gt;




&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://ccb.jhu.edu/software/kraken2/" rel="noopener noreferrer"&gt;Kraken2&lt;/a&gt; &lt;a href="https://avilpage.com/2024/07/mastering-kraken2-initial-runs.html#fnref:Kraken2" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/DerrickWood/kraken2/blob/master/docs/MANUAL.markdown#installation" rel="noopener noreferrer"&gt;Kraken2 - Manual - Install&lt;/a&gt; &lt;a href="https://avilpage.com/2024/07/mastering-kraken2-initial-runs.html#fnref:install_kraken2" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://benlangmead.github.io/aws-indexes/k2" rel="noopener noreferrer"&gt;Genomic Index Zone - k2&lt;/a&gt; &lt;a href="https://avilpage.com/2024/07/mastering-kraken2-initial-runs.html#fnref:GenomicIndexZone" rel="noopener noreferrer"&gt;↩&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>devops</category>
      <category>kraken2</category>
      <category>metagenomics</category>
    </item>
    <item>
      <title>Headlamp - k8s Lens open source alternative</title>
      <dc:creator>Chillar Anand</dc:creator>
      <pubDate>Sun, 23 Jun 2024 20:18:02 +0000</pubDate>
      <link>https://dev.to/chillaranand/headlamp-k8s-lens-open-source-alternative-4do1</link>
      <guid>https://dev.to/chillaranand/headlamp-k8s-lens-open-source-alternative-4do1</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JYizoOsi--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://avilpage.com/images/headlamp-k8s-lens-open-source-alternative.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JYizoOsi--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://avilpage.com/images/headlamp-k8s-lens-open-source-alternative.png" alt="headlamp - Open source Kubernetes Lens alternator" width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Since Lens is not open source, I tried out monokle, octant, k9s, and headlamp&lt;sup id="fnref:headlamp"&gt;&lt;a href="https://avilpage.com/2024/06/headlamp-k8s-lens-open-source-alternative.html#fn:headlamp"&gt;1&lt;/a&gt;&lt;/sup&gt;. Among them, headlamp's UI &amp;amp; features are closest to Lens.&lt;/p&gt;

&lt;h4&gt;
  
  
  Headlamp
&lt;/h4&gt;

&lt;p&gt;Headlamp is a CNCF sandbox project that provides a cross-platform desktop application for managing Kubernetes clusters. It auto-detects clusters and shows cluster-wide resource usage by default.&lt;/p&gt;

&lt;p&gt;It can also be installed inside the cluster and can be accessed using a web browser. This is useful when we want to access the cluster from a mobile device.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ helm repo add headlamp https://headlamp-k8s.github.io/headlamp/

$ helm install headlamp headlamp/headlamp

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's create a token &amp;amp; port-forward the service to access it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl create token headlamp

# we can do this via headlamp UI as well
$ kubectl port-forward service/headlamp 8080:80

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, we can access the headlamp UI at &lt;a href="http://localhost:8080"&gt;http://localhost:8080&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yczmjwp_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://avilpage.com/images/headlamp-k8s-lens-open-source-alternative2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yczmjwp_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://avilpage.com/images/headlamp-k8s-lens-open-source-alternative2.png" alt="headlamp - Open source Kubernetes Lens alternator" width="800" height="458"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Conclusion
&lt;/h4&gt;

&lt;p&gt;If you are looking for an open source alternative to Lens, headlamp is a good choice. It provides a UI &amp;amp; features similar to Lens, and it is accessible from mobile devices as well.&lt;/p&gt;




&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://headlamp.dev/"&gt;https://headlamp.dev/&lt;/a&gt; &lt;a href="https://avilpage.com/2024/06/headlamp-k8s-lens-open-source-alternative.html#fnref:headlamp"&gt;↩&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>devops</category>
      <category>kubernetes</category>
    </item>
  </channel>
</rss>
