<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dhanush Reddy</title>
    <description>The latest articles on DEV Community by Dhanush Reddy (@dhanushreddy29).</description>
    <link>https://dev.to/dhanushreddy29</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F598732%2F56ce88ab-55f8-4faf-b93b-272e5b288eed.jpg</url>
      <title>DEV Community: Dhanush Reddy</title>
      <link>https://dev.to/dhanushreddy29</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dhanushreddy29"/>
    <language>en</language>
    <item>
      <title>Getting started with Thordata</title>
      <dc:creator>Dhanush Reddy</dc:creator>
      <pubDate>Mon, 27 Oct 2025 17:54:59 +0000</pubDate>
      <link>https://dev.to/dhanushreddy29/getting-started-with-thordata-5ggn</link>
      <guid>https://dev.to/dhanushreddy29/getting-started-with-thordata-5ggn</guid>
      <description>&lt;p&gt;If you've ever tried to access information from different parts of the world or manage multiple accounts, you might have run into some roadblocks. Websites can change their content based on your location, and some might even block you. That's where a service like &lt;a href="https://www.thordata.com/?ls=EDBORvrR&amp;amp;lk=wb" rel="noopener noreferrer"&gt;Thordata&lt;/a&gt; comes in.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Thordata?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.thordata.com/?ls=EDBORvrR&amp;amp;lk=wb" rel="noopener noreferrer"&gt;Thordata&lt;/a&gt; is a service that provides access to a large network of residential proxies. In simple terms, it lets you use IP addresses from real devices in over 195 countries. This makes it look like your internet traffic is coming from a regular user in that location, which can be useful for a variety of tasks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Features of Thordata
&lt;/h2&gt;

&lt;p&gt;Here are some of the things you can do with Thordata:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Global Coverage:&lt;/strong&gt; With over 60 million IP addresses, you can access content from all over the world.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Different Proxy Types:&lt;/strong&gt; Thordata offers a few different kinds of proxies to fit your needs, including &lt;a href="https://www.thordata.com/products/residential-proxies" rel="noopener noreferrer"&gt;residential, ISP, and datacenter proxies&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Flexible Sessions:&lt;/strong&gt; You can choose to have your IP address change with every request ("rotating" sessions) or keep the same IP for a longer period ("sticky" sessions). This is helpful for tasks that require a consistent identity.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Easy Setup:&lt;/strong&gt; You can get started quickly with their endpoint generator, which automatically creates the credentials you need. You can also manually create users or use an IP allowlist for extra security.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Usage Tracking:&lt;/strong&gt; The dashboard includes a statistics tab where you can monitor your data usage in real time.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to Use Thordata
&lt;/h2&gt;

&lt;p&gt;Getting set up with Thordata is pretty straightforward. Here's a quick overview of the process:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Choose a Plan:&lt;/strong&gt; First, you'll need to select a &lt;a href="https://www.thordata.com/pricing" rel="noopener noreferrer"&gt;pricing plan&lt;/a&gt; that works for you.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Set Up Your Proxies:&lt;/strong&gt; In the dashboard, you can use the endpoint generator to create your proxy credentials.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Configure Your Location and Session Type:&lt;/strong&gt; You can choose a random IP from their global pool or select a specific country. Then, decide if you want a rotating or sticky session.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Copy and Integrate:&lt;/strong&gt; Once you've set your preferences, you can copy the proxy details and integrate them into your application or browser.&lt;/li&gt;
&lt;/ol&gt;
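The integration step above can be sketched in code. Here is a minimal Python example using the `requests` library, assuming the common `user:pass@host:port` proxy URL format; the host, port, and credentials below are placeholders, so check your dashboard's endpoint generator for the exact values:

```python
import requests

def build_proxies(username, password, host, port):
    # Build a requests-style proxies dict from proxy credentials.
    # The user:pass@host:port URL format is a common convention;
    # copy the exact endpoint from your dashboard (values below are placeholders).
    proxy_url = f"http://{username}:{password}@{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}

proxies = build_proxies("your-username", "your-password", "proxy.example.com", 9999)

# Route a request through the proxy; the target site sees the proxy's IP:
# requests.get("https://httpbin.org/ip", proxies=proxies, timeout=30)
print(proxies["https"])
```

With a sticky session, the same proxy URL keeps returning the same exit IP for the session's lifetime; with a rotating session, each request may exit from a different IP.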

&lt;p&gt;If you're interested in giving it a try, &lt;a href="https://www.thordata.com/?ls=EDBORvrR&amp;amp;lk=wb" rel="noopener noreferrer"&gt;Thordata offers a free trial&lt;/a&gt; to help you get started. You can use their residential proxies for tasks like web scraping and ad verification.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>The Daily Gist: Never miss the best of Reddit again. Powered by Bright Data.</title>
      <dc:creator>Dhanush Reddy</dc:creator>
      <pubDate>Mon, 01 Sep 2025 05:24:10 +0000</pubDate>
      <link>https://dev.to/dhanushreddy29/the-daily-gist-never-miss-the-best-of-reddit-again-powered-by-bright-data-5238</link>
      <guid>https://dev.to/dhanushreddy29/the-daily-gist-never-miss-the-best-of-reddit-again-powered-by-bright-data-5238</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/brightdata-n8n-2025-08-13"&gt;AI Agents Challenge powered by n8n and Bright Data&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;The Daily Gist is an n8n workflow that creates a 10- to 15-minute video summary of a subreddit's top posts. To do this, it uses the subreddit's RSS feed.&lt;/p&gt;

&lt;p&gt;The following video is an example generated by my n8n workflow for the subreddit &lt;a href="https://www.reddit.com/r/ArtificialInteligence/" rel="noopener noreferrer"&gt;r/ArtificialInteligence&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/Zeh6Lshj04A"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  n8n Workflow
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://gist.github.com/dhanushreddy291/ccd40f033355e74e04b8cf0a3058047b" rel="noopener noreferrer"&gt;Github Gist&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Implementation
&lt;/h2&gt;

&lt;p&gt;This workflow is a scheduled job, built with n8n, that creates a video summary of the top posts from a chosen subreddit every day.&lt;/p&gt;

&lt;p&gt;The workflow operates in several distinct stages:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3orcg0e8s5ofgdj8dk1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3orcg0e8s5ofgdj8dk1.png" alt="n8n Workflow Part 1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjnsyg65m95um8vmvi21f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjnsyg65m95um8vmvi21f.png" alt="n8n Workflow Part 2"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5r3mtacl7amqm1qh1lj5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5r3mtacl7amqm1qh1lj5.png" alt="n8n Workflow Part 3"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Data Ingestion:&lt;/strong&gt; The process begins by ingesting post URLs from the subreddit's RSS feed.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Content Scraping &amp;amp; Ranking:&lt;/strong&gt; These URLs are passed to a Bright Data node, which scrapes the content of each post. The workflow then filters and ranks the posts, selecting the top 15 based on a combination of upvotes and comments.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;AI-Powered Content Generation:&lt;/strong&gt; For each of the top posts, the system utilizes two distinct AI models from Gemini:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Visuals:&lt;/strong&gt; An image generation model, &lt;a href="https://developers.googleblog.com/en/introducing-gemini-2-5-flash-image/" rel="noopener noreferrer"&gt;Gemini 2.5 Flash Image, a.k.a Nano Banana&lt;/a&gt;, creates a unique, context-aware image for the post.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audio:&lt;/strong&gt; The audio generation is a two-step process: first, the &lt;a href="https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-flash" rel="noopener noreferrer"&gt;Gemini 2.5 Flash&lt;/a&gt; LLM creates a text-based summary. This text is then converted into speech by the specialized &lt;a href="https://cloud.google.com/text-to-speech/docs/gemini-tts" rel="noopener noreferrer"&gt;Gemini 2.5 Flash Preview TTS model&lt;/a&gt; for the final audio narration.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Video Synthesis:&lt;/strong&gt; With a unique image and audio track for each post, an &lt;code&gt;ffmpeg&lt;/code&gt; command is executed to sequence and merge these elements into a final, consolidated MP4 video.&lt;/li&gt;
&lt;/ol&gt;
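The filtering and ranking in step 2 can be sketched outside n8n. A minimal Python sketch, assuming each scraped post is a dict with hypothetical `upvotes` and `num_comments` fields (the actual field names depend on the scraper's output schema):

```python
def top_posts(posts, limit=15):
    # Rank posts by a simple combined score of upvotes and comments,
    # then keep the highest-scoring ones (field names are assumptions).
    def score(post):
        return post.get("upvotes", 0) + post.get("num_comments", 0)
    return sorted(posts, key=score, reverse=True)[:limit]

posts = [
    {"title": "A", "upvotes": 120, "num_comments": 30},
    {"title": "B", "upvotes": 15, "num_comments": 2},
    {"title": "C", "upvotes": 300, "num_comments": 90},
]
print([p["title"] for p in top_posts(posts, limit=2)])  # ['C', 'A']
```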

&lt;p&gt;In my case, the completed video is saved to a shared storage volume accessible by a &lt;a href="https://jellyfin.org/" rel="noopener noreferrer"&gt;Jellyfin media server&lt;/a&gt; running in a parallel container. This setup allows me to seamlessly stream and watch my personalized Reddit summary on any of my devices. While this is great for personal viewing, the workflow can be easily extended using n8n's built-in integrations to automatically upload the video to YouTube, send it via Telegram, or distribute it to virtually any other platform.&lt;/p&gt;
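The video synthesis step can be sketched as a small Python wrapper around &lt;code&gt;ffmpeg&lt;/code&gt;. A minimal sketch, assuming one image and one narration file per post (file names are placeholders); the per-clip command loops the still image for the duration of the audio:

```python
import subprocess

def build_clip_cmd(image_path, audio_path, output_path):
    # Loop the still image, pair it with the narration, and stop when
    # the audio ends (-shortest). Encodes H.264 video and AAC audio.
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", image_path,
        "-i", audio_path,
        "-c:v", "libx264", "-tune", "stillimage",
        "-c:a", "aac", "-shortest",
        "-pix_fmt", "yuv420p",
        output_path,
    ]

cmd = build_clip_cmd("post_01.png", "post_01.mp3", "post_01.mp4")
# subprocess.run(cmd, check=True)  # uncomment to actually render the clip
print(" ".join(cmd))
```

The individual per-post clips can then be joined into one MP4 with ffmpeg's concat demuxer.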

&lt;h3&gt;
  
  
  Bright Data Verified Node
&lt;/h3&gt;

&lt;p&gt;The core data extraction was handled by Bright Data's Web Scraper node. The workflow begins by ingesting URLs from the RSS feed of the &lt;a href="https://www.reddit.com/r/ArtificialInteligence.rss" rel="noopener noreferrer"&gt;r/ArtificialInteligence&lt;/a&gt; subreddit. These URLs are then processed in a batch by the Bright Data node.&lt;/p&gt;
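The ingestion step can be reproduced outside n8n with the standard library. A minimal sketch, assuming the feed has already been fetched as text; it handles both Atom feeds (which Reddit serves) and classic RSS, and the function name is illustrative:

```python
import xml.etree.ElementTree as ET

def extract_post_urls(feed_text):
    # Parse a subreddit feed and collect each entry's URL. Reddit serves
    # Atom feeds, so look for Atom entry/link elements as well as
    # classic RSS item/link elements.
    root = ET.fromstring(feed_text)
    ns = {"atom": "http://www.w3.org/2005/Atom"}
    urls = [link.get("href") for link in root.findall(".//atom:entry/atom:link", ns)]
    urls += [link.text for link in root.findall(".//item/link")]
    return urls

# Example usage (network call commented out):
# import urllib.request
# feed_text = urllib.request.urlopen(
#     "https://www.reddit.com/r/ArtificialInteligence.rss").read()
# print(extract_post_urls(feed_text))
```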

&lt;p&gt;Scraping websites like Reddit poses a significant challenge. The content is loaded dynamically, and the HTML structure is complex, making traditional scraping with simple CSS selectors unreliable and difficult to maintain.&lt;/p&gt;

&lt;p&gt;This is where Bright Data made things easy. The node effortlessly extracted key information such as post titles, upvote counts, authors, and replies. Using the Bright Data node was critical, as it provides a managed solution that guarantees a structured response, eliminating the need to build and maintain a complex, fragile custom parser for Reddit's front end.&lt;/p&gt;

&lt;h2&gt;
  
  
  Journey
&lt;/h2&gt;

&lt;p&gt;Coming from a coding background, I was eager to see what n8n could do. The transition to n8n's visual workflow builder was seamless, and I was able to get up to speed very quickly.&lt;/p&gt;

&lt;p&gt;The main technical challenge I encountered was a missing dependency in the standard deployment. The default self-hosted n8n image lacks FFmpeg, which was essential for my video automation workflow. I resolved this by building and publishing my own custom n8n image with FFmpeg included, which is now available for the community on GitHub: &lt;a href="https://github.com/dhanushreddy291/n8n-with-ffmpeg" rel="noopener noreferrer"&gt;dhanushreddy291/n8n-with-ffmpeg&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This experience was incredibly valuable. It taught me that n8n's true power is the sheer speed at which you can build and deploy. Thanks to the extensive library of integrations, complex tasks that would normally require significant coding, like creating custom APIs or setting up cron jobs, can be accomplished in a matter of minutes. This ability to go from idea to functional workflow so quickly has made n8n my go-to solution.&lt;/p&gt;

&lt;p&gt;This submission was made by &lt;a href="https://dev.to/dhanushreddy29"&gt;Dhanush Reddy&lt;/a&gt;&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>n8nbrightdatachallenge</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>NewsSnap: Techy+bite-sized news in minutes</title>
      <dc:creator>Dhanush Reddy</dc:creator>
      <pubDate>Mon, 27 Jan 2025 07:58:19 +0000</pubDate>
      <link>https://dev.to/dhanushreddy29/researchbyte-techybite-sized-videos-in-minutes-3a1l</link>
      <guid>https://dev.to/dhanushreddy29/researchbyte-techybite-sized-videos-in-minutes-3a1l</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://srv.buysellads.com/ads/long/x/T6EK3TDFTTTTTT6WWB6C5TTTTTTGBRAPKATTTTTTWTFVT7YTTTTTTKPPKJFH4LJNPYYNNSZL2QLCE2DPPQVCEI45GHBT" rel="noopener noreferrer"&gt;Agent.ai&lt;/a&gt; Challenge: Productivity-Pro Agent (&lt;a href="https://dev.to/challenges/agentai"&gt;See Details&lt;/a&gt;)&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;NewsSnap empowers users to stay informed with lightning-fast, comprehensive research. It instantly scrapes and synthesizes data from across the internet to create concise, visually engaging 5-minute video summaries of topics, companies, or the past 7 days' news. Perfect for busy professionals, students, or anyone curious about the world, NewsSnap delivers reliable insights at your fingertips. Whether you're preparing for a meeting, studying, or catching up on the latest trends, NewsSnap transforms overwhelming information into clear, engaging videos—keeping you up-to-date with speed and simplicity.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;To clarify, the news video is exclusively generated using information from the past 7 days from Google News, focusing solely on the specified topic or company as the context.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://agent.ai/agent/research-byte" rel="noopener noreferrer"&gt;https://agent.ai/agent/research-byte&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's a sample video generated by NewsSnap with just the prompt "Deepseek":&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/SAkjSwBd6Rs"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;Here's another video with the prompt "Sam Altman"&lt;br&gt;
&lt;a href="https://agent.ai/agent/research-byte?rid=426f90b022b94c3b8351b0dfbd1918a2" rel="noopener noreferrer"&gt;https://agent.ai/agent/research-byte?rid=426f90b022b94c3b8351b0dfbd1918a2&lt;/a&gt;&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/joaKRvbRHDM"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Agent.ai Experience
&lt;/h2&gt;

&lt;p&gt;While the agent builder offers an intuitive interface for basic tasks, I have encountered several challenges when developing more complex workflows. Specifically, AWS Lambda deployments have consistently failed without clear error messages, necessitating deployment to a personal server and invocation via &lt;code&gt;POST&lt;/code&gt; requests. Additionally, I have experienced issues with &lt;code&gt;FOR&lt;/code&gt; loops within the agent builder, and the debugging tools have not provided sufficient diagnostic information. Furthermore, the handling and formatting of JSON data require further clarification. I have seen others using GPT-4 to parse JSON outputs, which should really be done in code, but given the limited debugging options, that workaround is understandable.&lt;/p&gt;

&lt;p&gt;Despite these challenges, I recognize the platform's potential. With further development and resolution of these issues, the agent builder could empower users of all skill levels to rapidly create AI applications. It has the potential to become a central hub for AI agent orchestration, similar to Zapier, enabling agents to trigger external integrations based on their internal logic.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>agentaichallenge</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>NewsSnap: Techy+bite-sized news in minutes</title>
      <dc:creator>Dhanush Reddy</dc:creator>
      <pubDate>Mon, 27 Jan 2025 07:58:14 +0000</pubDate>
      <link>https://dev.to/dhanushreddy29/researchbyte-techy-bite-sized-videos-in-minutes-2ef7</link>
      <guid>https://dev.to/dhanushreddy29/researchbyte-techy-bite-sized-videos-in-minutes-2ef7</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://srv.buysellads.com/ads/long/x/T6EK3TDFTTTTTT6WWB6C5TTTTTTGBRAPKATTTTTTWTFVT7YTTTTTTKPPKJFH4LJNPYYNNSZL2QLCE2DPPQVCEI45GHBT" rel="noopener noreferrer"&gt;Agent.ai&lt;/a&gt; Challenge: Full-Stack Agent (&lt;a href="https://dev.to/challenges/agentai"&gt;See Details&lt;/a&gt;)&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;NewsSnap empowers users to stay informed with lightning-fast, comprehensive research. It instantly scrapes and synthesizes data from across the internet to create concise, visually engaging 5-minute video summaries of topics, companies, or the past 7 days' news. Perfect for busy professionals, students, or anyone curious about the world, NewsSnap delivers reliable insights at your fingertips. Whether you're preparing for a meeting, studying, or catching up on the latest trends, NewsSnap transforms overwhelming information into clear, engaging videos—keeping you up-to-date with speed and simplicity.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;To clarify, the news video is exclusively generated using information from the past 7 days from Google News, focusing solely on the specified topic or company as the context.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://agent.ai/agent/research-byte" rel="noopener noreferrer"&gt;https://agent.ai/agent/research-byte&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's a sample video generated by NewsSnap with just the prompt "Deepseek":&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/SAkjSwBd6Rs"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;Here's another video with the prompt "Sam Altman"&lt;br&gt;
&lt;a href="https://agent.ai/agent/research-byte?rid=426f90b022b94c3b8351b0dfbd1918a2" rel="noopener noreferrer"&gt;https://agent.ai/agent/research-byte?rid=426f90b022b94c3b8351b0dfbd1918a2&lt;/a&gt;&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/joaKRvbRHDM"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Agent.ai Experience
&lt;/h2&gt;

&lt;p&gt;While the agent builder offers an intuitive interface for basic tasks, I have encountered several challenges when developing more complex workflows. Specifically, AWS Lambda deployments have consistently failed without clear error messages, necessitating deployment to a personal server and invocation via &lt;code&gt;POST&lt;/code&gt; requests. Additionally, I have experienced issues with &lt;code&gt;FOR&lt;/code&gt; loops within the agent builder, and the debugging tools have not provided sufficient diagnostic information. Furthermore, the handling and formatting of JSON data require further clarification. I have seen others using GPT-4 to parse JSON outputs, which should really be done in code, but given the limited debugging options, that workaround is understandable.&lt;/p&gt;

&lt;p&gt;Despite these challenges, I recognize the platform's potential. With further development and resolution of these issues, the agent builder could empower users of all skill levels to rapidly create AI applications. It has the potential to become a central hub for AI agent orchestration, similar to Zapier, enabling agents to trigger external integrations based on their internal logic.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>agentaichallenge</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Reddit Recap: Audio summaries of subreddits powered by BrightData</title>
      <dc:creator>Dhanush Reddy</dc:creator>
      <pubDate>Mon, 30 Dec 2024 06:33:50 +0000</pubDate>
      <link>https://dev.to/dhanushreddy29/reddit-recap-3j6d</link>
      <guid>https://dev.to/dhanushreddy29/reddit-recap-3j6d</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/brightdata"&gt;Bright Data Web Scraping Challenge&lt;/a&gt;: Most Creative Use of Web Data for AI Models&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://reddit-recap-woad.vercel.app" rel="noopener noreferrer"&gt;Reddit Recap&lt;/a&gt; is an application that scrapes subreddits using &lt;a href="https://brightdata.com" rel="noopener noreferrer"&gt;BrightData&lt;/a&gt; and generates concise summaries every two hours. These summaries are then converted into audio briefings, all accessible through a beautiful web app, allowing users to effortlessly stay informed about their favorite communities.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I Built It
&lt;/h2&gt;

&lt;p&gt;I wanted to tackle a personal problem I've faced: staying up-to-date with the latest discussions and news in the communities I care about. While Reddit offers an incredible wealth of discussions, the sheer volume of content became overwhelming. That's why I created Reddit Recap—a tool that distills the platform's endless stream of information into digestible, curated updates, helping me stay connected to the conversations that matter most to me.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;Check out Reddit Recap &lt;a href="https://reddit-recap-woad.vercel.app" rel="noopener noreferrer"&gt;here&lt;/a&gt;. While I've customized the current deployment to track subreddits that match my interests (&lt;a href="https://www.reddit.com/r/singularity/" rel="noopener noreferrer"&gt;r/singularity&lt;/a&gt;, &lt;a href="https://www.reddit.com/r/LocalLLaMA/" rel="noopener noreferrer"&gt;r/LocalLLaMA&lt;/a&gt;, and &lt;a href="https://www.reddit.com/r/homeautomation/" rel="noopener noreferrer"&gt;r/homeautomation&lt;/a&gt;), you can easily create your own version by using the source code to monitor the communities you care about.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27qp0nxmju5gd0aes14w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27qp0nxmju5gd0aes14w.png" alt="Reddit Recap" width="800" height="344"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Used Bright Data
&lt;/h2&gt;

&lt;p&gt;Bright Data was absolutely essential for building Reddit Recap. Scraping Reddit is incredibly challenging due to its sophisticated anti-scraping mechanisms. I leveraged BrightData's &lt;a href="https://brightdata.com/products/web-scraper" rel="noopener noreferrer"&gt;Web Scraper API&lt;/a&gt; for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Reliable Data Extraction: Reddit Dataset (&lt;a href="https://brightdata.com/cp/data_api/gd_lvz8ah06191smkebj4/subreddit_url?tab=overview" rel="noopener noreferrer"&gt;gd_lvz8ah06191smkebj4&lt;/a&gt;) provided structured and dependable access to Reddit posts, eliminating the need to build and maintain my own complex scraping infrastructure.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Bypassing Anti-Scraping Measures: Bright Data's infrastructure seamlessly handles IP blocking, CAPTCHAs, and other anti-scraping techniques that would cripple traditional scrapers. This allowed me to focus on the application's core logic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Efficient Data Retrieval: The Bright Data API made it easy to target specific subreddits and retrieve the latest top posts in a structured format, saving significant development time.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
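A trigger call to the Web Scraper API can be sketched as follows. This is a minimal sketch, not Bright Data's exact interface: the endpoint URL and payload shape are assumptions based on the dataset trigger pattern, so check the Web Scraper API docs for the current specifics. The dataset id is the Reddit dataset mentioned above; the token and subreddit list are placeholders.

```python
def build_trigger_request(api_token, dataset_id, subreddit_urls):
    # Sketch of a Bright Data dataset trigger call. The endpoint and
    # payload shape are assumptions; consult the API docs for specifics.
    return {
        "url": "https://api.brightdata.com/datasets/v3/trigger",
        "params": {"dataset_id": dataset_id},
        "headers": {"Authorization": f"Bearer {api_token}"},
        "json": [{"url": u} for u in subreddit_urls],
    }

req = build_trigger_request(
    "YOUR_API_TOKEN",
    "gd_lvz8ah06191smkebj4",  # the Reddit dataset id from the dashboard
    ["https://www.reddit.com/r/singularity/"],
)
# import requests
# response = requests.post(req["url"], params=req["params"],
#                          headers=req["headers"], json=req["json"])
print(req["params"]["dataset_id"])
```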

&lt;p&gt;Here is a high-level architectural overview of the app:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff40abq8xtn740g6g9jee.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff40abq8xtn740g6g9jee.png" alt="Architecture overview" width="800" height="484"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The web app can also qualify under: &lt;em&gt;Prompt 1: Scrape Data from Complex, Interactive Websites&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Benefits of Reddit Recap
&lt;/h3&gt;

&lt;p&gt;Reddit Recap offers several key advantages for busy individuals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Stay Informed Effortlessly:&lt;/strong&gt;  No more endless scrolling! Get the gist of what's happening in your favorite subreddits in minutes.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Audio Summaries on the Go:&lt;/strong&gt;  Listen to your Reddit news during your commute, workout, or while doing chores.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Time Savings:&lt;/strong&gt; Reclaim valuable time by quickly catching up on relevant discussions.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Clean and Organized Presentation:&lt;/strong&gt;  The web app provides a clear and easy-to-navigate interface for accessing the summaries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This submission was made by &lt;a href="https://dev.to/dhanushreddy29"&gt;Dhanush Reddy&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Code
&lt;/h3&gt;

&lt;p&gt;You can find the complete code &lt;a href="https://github.com/dhanushreddy291/reddit-recap" rel="noopener noreferrer"&gt;here&lt;/a&gt;. Feel free to fork it and customize it for your own subreddit interests.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>brightdatachallenge</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>How to create custom nodes in ComfyUI</title>
      <dc:creator>Dhanush Reddy</dc:creator>
      <pubDate>Sun, 28 Jul 2024 11:16:05 +0000</pubDate>
      <link>https://dev.to/dhanushreddy29/how-to-create-custom-nodes-in-comfyui-bgh</link>
      <guid>https://dev.to/dhanushreddy29/how-to-create-custom-nodes-in-comfyui-bgh</guid>
      <description>&lt;h2&gt;
  
  
  What is ComfyUI?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/comfyanonymous/ComfyUI" rel="noopener noreferrer"&gt;ComfyUI&lt;/a&gt; is a powerful and flexible user interface for Stable Diffusion, allowing users to create complex image generation workflows through a node-based system. While ComfyUI comes with a variety of built-in nodes, its true strength lies in its extensibility. Custom nodes enable users to add new functionality, integrate external services, and tailor it to their specific needs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fft1k06djbvvgfnuas7za.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fft1k06djbvvgfnuas7za.gif" alt="An image showing the interface and working of ComfyUI" width="760" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this blog post, we will walk through the process of creating a custom node for image captioning using ComfyUI. This node will take an image as input and return a generated caption using an external API.&lt;/p&gt;

&lt;p&gt;We will use the &lt;a href="https://ai.google.dev/" rel="noopener noreferrer"&gt;Google Gemini API&lt;/a&gt; to generate the image caption.&lt;/p&gt;

&lt;p&gt;Here is the complete code that performs image captioning using the Gemini API.&lt;/p&gt;

&lt;p&gt;You can copy the following code into any file under the &lt;code&gt;custom_nodes&lt;/code&gt; folder in ComfyUI; I have named mine &lt;code&gt;gemini-caption.py&lt;/code&gt;.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F638alm29v36z0gh9yvl7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F638alm29v36z0gh9yvl7.png" alt="Where to store the file" width="278" height="98"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Complete code for Generating Image Captions
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;PIL&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;base64&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ImageCaptioningNode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nd"&gt;@classmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;INPUT_TYPES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;required&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;IMAGE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;STRING&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;})}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;RETURN_TYPES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;STRING&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,)&lt;/span&gt;
    &lt;span class="n"&gt;FUNCTION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;caption_image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;CATEGORY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;OUTPUT_NODE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;caption_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Convert the image tensor to a PIL Image
&lt;/span&gt;        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromarray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;255.0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;squeeze&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uint8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Convert the image to base64
&lt;/span&gt;        &lt;span class="n"&gt;buffered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;BytesIO&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buffered&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PNG&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;img_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;base64&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;b64encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buffered&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getvalue&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;api_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                        &lt;span class="p"&gt;{&lt;/span&gt;
                            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Generate a caption for this image in as detail as possible. Don&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t send anything else apart from the caption.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                        &lt;span class="p"&gt;},&lt;/span&gt;
                        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inline_data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mime_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image/png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;img_str&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
                    &lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;# Send the request to the Gemini API
&lt;/span&gt;        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;caption&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;candidates&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exceptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequestException&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;caption&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error: Unable to generate caption. &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;caption&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;caption&lt;/span&gt;&lt;span class="p"&gt;,)&lt;/span&gt;


&lt;span class="n"&gt;NODE_CLASS_MAPPINGS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ImageCaptioningNode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ImageCaptioningNode&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Here is how the node looks on the UI:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F55d5kbqmjuhqbf601ifh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F55d5kbqmjuhqbf601ifh.png" alt="Custom ComfyUI Node" width="800" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's go over it line by line to understand how you would create a similar node for your own use case. First of all, whatever node you want to create, write its core logic as a function, so ComfyUI can call it the same way it calls my &lt;code&gt;caption_image&lt;/code&gt; function here.&lt;/p&gt;
&lt;h3&gt;
  
  
  Import the necessary libraries needed
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;PIL&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;base64&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;These lines import the necessary libraries for my Image Captioning node:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;numpy&lt;/code&gt; for numerical operations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;PIL&lt;/code&gt; (Python Imaging Library) for image processing&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;requests&lt;/code&gt; for making HTTP requests to Gemini API&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;io&lt;/code&gt; for handling byte streams&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;base64&lt;/code&gt; for encoding the image&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Defining the ClassName for your ComfyUI node
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ImageCaptioningNode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nd"&gt;@classmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;INPUT_TYPES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;required&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;IMAGE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;STRING&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;})}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;In my case, I have named it &lt;code&gt;ImageCaptioningNode&lt;/code&gt;, as the name describes exactly what it does.&lt;/p&gt;

&lt;p&gt;The class method defines the input types for our node:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An "image" input of type "IMAGE"&lt;/li&gt;
&lt;li&gt;An "api_key" input of type "STRING" with a default empty value, needed for sending API requests to Gemini API.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="n"&gt;RETURN_TYPES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;STRING&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,)&lt;/span&gt;
    &lt;span class="n"&gt;FUNCTION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;caption_image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;CATEGORY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;OUTPUT_NODE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;These class variables define:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The return type (a string)&lt;/li&gt;
&lt;li&gt;The main function to be called ("caption_image")&lt;/li&gt;
&lt;li&gt;The category in which the node will appear in ComfyUI&lt;/li&gt;
&lt;li&gt;That this node can be an output node
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;caption_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Convert the image tensor to a PIL Image
&lt;/span&gt;        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromarray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;255.0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;numpy&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;squeeze&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uint8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Convert the image to base64
&lt;/span&gt;        &lt;span class="n"&gt;buffered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;BytesIO&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buffered&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PNG&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;img_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;base64&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;b64encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buffered&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getvalue&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;api_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="c1"&gt;# Prepare the request payload
&lt;/span&gt;        &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                        &lt;span class="p"&gt;{&lt;/span&gt;
                            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Generate a caption for this image in as detail as possible. Don&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t send anything else apart from the caption.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                        &lt;span class="p"&gt;},&lt;/span&gt;
                        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inline_data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mime_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image/png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;img_str&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
                    &lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;caption&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;candidates&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exceptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequestException&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;caption&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error: Unable to generate caption. &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;caption&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;caption&lt;/span&gt;&lt;span class="p"&gt;,)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This is a standalone function I have written that takes an image as input and sends it to the Gemini API using the API key. The code is straightforward: we base64-encode the image so it can be included in the request body, and the prompt instructs Gemini to caption the image in detail. The response from the API is parsed, printed to the console, and returned as a tuple (required by ComfyUI).&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;NODE_CLASS_MAPPINGS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ImageCaptioningNode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ImageCaptioningNode&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This dictionary maps the class name to the class itself, which is used by ComfyUI to register the custom node.&lt;/p&gt;


&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Creating custom nodes for ComfyUI opens up a world of possibilities for extending and enhancing your image generation workflows. In this article, we've walked through the process of building a custom image captioning node, demonstrating how to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Define input and output types&lt;/li&gt;
&lt;li&gt; Integrate with external APIs (in this case, the Gemini API for image captioning)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By following these steps, you can create your own custom nodes to add virtually any functionality you need to ComfyUI. Whether you're integrating new LLM models, adding specialized image processing techniques, or creating shortcuts for common tasks, custom nodes allow you to tailor ComfyUI to your specific requirements.&lt;/p&gt;

&lt;p&gt;Remember that while we've focused on image captioning in this example, the same principles can be applied to create nodes for a wide variety of tasks. The key is to understand the structure of a ComfyUI node and how to interface with the expected inputs and outputs.&lt;/p&gt;
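&lt;p&gt;As a quick reference, the pattern above boils down to a small skeleton. Here is a minimal sketch (the class name, input, and logic are illustrative, not from this article) that you can adapt for other tasks:&lt;/p&gt;

```python
# Minimal ComfyUI custom node skeleton (illustrative names and logic)
class MyCustomNode:
    @classmethod
    def INPUT_TYPES(s):
        # Declare the inputs the node needs
        return {"required": {"text": ("STRING", {"default": ""})}}

    RETURN_TYPES = ("STRING",)  # what the node outputs
    FUNCTION = "run"            # the method ComfyUI will call
    CATEGORY = "utils"          # where the node appears in the UI menu

    def run(self, text):
        result = text.upper()   # replace with your own logic
        return (result,)        # ComfyUI expects a tuple


# Register the node so ComfyUI can discover it
NODE_CLASS_MAPPINGS = {"MyCustomNode": MyCustomNode}
```

&lt;p&gt;Drop a file like this into ComfyUI's &lt;code&gt;custom_nodes&lt;/code&gt; folder and restart ComfyUI for the node to show up.&lt;/p&gt;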

&lt;p&gt;If you still have any questions about this post or want to discuss something with me, feel free to connect on &lt;a href="https://www.linkedin.com/in/dhanushreddy29/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or &lt;a href="https://twitter.com/dhanushreddy291" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you run an organization and want me to write for you, please connect with me on my Socials 🙃&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>comfyui</category>
      <category>stablediffusion</category>
      <category>genai</category>
      <category>python</category>
    </item>
    <item>
      <title>Set up your own personal browser in the Cloud</title>
      <dc:creator>Dhanush Reddy</dc:creator>
      <pubDate>Sun, 17 Mar 2024 08:22:35 +0000</pubDate>
      <link>https://dev.to/dhanushreddy29/set-up-your-own-personal-browser-in-the-cloud-2841</link>
      <guid>https://dev.to/dhanushreddy29/set-up-your-own-personal-browser-in-the-cloud-2841</guid>
      <description>&lt;p&gt;In the age of internet surveillance and data breaches, maintaining privacy and security online has never been more critical. Thankfully, with modern cloud technologies and a bit of ingenuity, creating a private and secure browsing environment is not only possible but also quite straightforward. This article guides you through setting up your own personal browser in the cloud, leveraging a Docker container with Firefox installed on a fly.io virtual machine (VM). The process is surprisingly simple and offers numerous benefits, including enhanced security, privacy, and high-speed internet access, no matter where you are in the world.&lt;/p&gt;

&lt;p&gt;I don't want to waste your time: just clone the following &lt;a href="https://github.com/dhanushreddy291/flyio-browser" rel="noopener noreferrer"&gt;repo&lt;/a&gt;. It's a very simple config that sets up a Docker daemon on a Fly VM. Of course, you could run Firefox as a standalone container image on the VM, but good luck deploying it that way; I ran into a lot of errors related to &lt;code&gt;s6-overlay-suexec&lt;/code&gt;. So as a simple workaround, I deploy the Firefox container on a VM that has Docker installed.&lt;/p&gt;

&lt;p&gt;Of course, I am not building the Firefox image myself; I am simply using the &lt;a href="https://docs.linuxserver.io/images/docker-firefox" rel="noopener noreferrer"&gt;kasm Firefox image&lt;/a&gt; from &lt;a href="https://linuxserver.io" rel="noopener noreferrer"&gt;linuxserver.io&lt;/a&gt;.&lt;/p&gt;
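&lt;p&gt;If you want to try the image locally before deploying, the linuxserver.io docs show a &lt;code&gt;docker run&lt;/code&gt; invocation along these lines (the port and environment variables here follow their documentation; double-check them there):&lt;/p&gt;

```shell
# Run the linuxserver.io Firefox (KASM-based) image locally
docker run -d \
  --name=firefox \
  -e PUID=1000 -e PGID=1000 \
  -p 3000:3000 \
  --shm-size="1gb" \
  lscr.io/linuxserver/firefox:latest
# Then open http://localhost:3000 in your local browser
```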

&lt;h2&gt;
  
  
  Demo of what we are building
&lt;/h2&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-1769008521142493360-737" src="https://platform.twitter.com/embed/Tweet.html?id=1769008521142493360"&gt;
&lt;/iframe&gt;

&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploy
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://fly.io/" rel="noopener noreferrer"&gt;Fly.io&lt;/a&gt; is a platform that helps you run your apps and databases closer to your users all around the world. It takes your app code, packages it up neatly, and puts it on virtual machines that can be quickly started or stopped. This makes your app faster for users and more reliable. Fly.io is easy to use, works well for small projects or personal apps. It's a great way to make sure your app runs smoothly for people no matter where they are.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pros of using a cloud browser
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Private and Secure&lt;/strong&gt;: Your browsing environment is isolated, reducing the risk of malware and eavesdropping.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;High-Speed Internet&lt;/strong&gt;: Utilize the high bandwidth of remote servers for faster browsing, especially beneficial for bandwidth-intensive tasks.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Easy Deployment&lt;/strong&gt;: With my guide, deploying your cloud browser is a breeze.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Persistent Data&lt;/strong&gt;: By mounting a volume to the VM, your browser history and data remain intact between sessions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Deployment Steps
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Configure the &lt;code&gt;fly.toml&lt;/code&gt; File&lt;/strong&gt;: Begin by cloning the provided &lt;a href="https://github.com/dhanushreddy291/flyio-browser" rel="noopener noreferrer"&gt;repository&lt;/a&gt;. Edit the &lt;code&gt;fly.toml&lt;/code&gt; file to change the app name to something unique and select your preferred region for deployment. I used &lt;code&gt;sin&lt;/code&gt; for Singapore, but Fly.io offers a wide range of &lt;a href="https://fly.io/docs/reference/regions" rel="noopener noreferrer"&gt;regions&lt;/a&gt;; choose whichever is closest to you to minimize latency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Set Up Authentication&lt;/strong&gt;: Modify the default username and password in the &lt;code&gt;deploy.sh&lt;/code&gt; script to ensure your browser is secure.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Launch Your Cloud Browser&lt;/strong&gt;: Run the &lt;code&gt;deploy.sh&lt;/code&gt; script. This script will install &lt;code&gt;flyctl&lt;/code&gt; (if not already installed), authenticate your Fly.io account, and handle the deployment of your Firefox container to your selected region. Follow the on-screen prompts during deployment.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
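As an illustration of step 1, the `fly.toml` edits usually come down to two fields. The values below are illustrative placeholders, not the repository's actual defaults; keep whatever else the file already contains:

```toml
# Illustrative fly.toml fragment; the app name must be globally unique
app = "my-cloud-browser"
# Region code from Fly.io's regions list, e.g. "sin" for Singapore
primary_region = "sin"
```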

&lt;h3&gt;
  
  
  Accessing Your Cloud Browser
&lt;/h3&gt;

&lt;p&gt;Once deployment is complete, Fly.io will provide a URL for accessing your cloud browser. Navigate to this URL and log in with the username and password you set earlier. Just like that, you're ready to enjoy a secure, private browsing experience from anywhere.&lt;/p&gt;

&lt;h3&gt;
  
  
  Managing Costs
&lt;/h3&gt;

&lt;p&gt;Running a 2 GB instance on Fly.io costs approximately $0.01476 per hour, or around $10.77 for a full month. To manage expenses, you can stop the VM when it's not in use: Fly.io bills by the second, allowing precise control over your spending.&lt;/p&gt;
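The arithmetic is easy to sanity-check; a quick sketch, using the rates quoted above (treat them as illustrative rather than current pricing):

```python
# Estimate Fly.io costs from the hourly rate quoted above (illustrative only;
# check Fly.io's pricing page for current numbers).
hourly_rate = 0.01476                  # USD per hour for a 2 GB instance

always_on = hourly_rate * 730          # ~730 hours in an average month
print(round(always_on, 2))             # roughly 10.77 USD

# Because billing is per second, stopping the VM when idle scales the cost down:
two_hours_a_day = hourly_rate * 2 * 30
print(round(two_hours_a_day, 2))       # under a dollar a month
```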

&lt;h3&gt;
  
  
  Advanced Features
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Custom Domain
&lt;/h4&gt;

&lt;p&gt;For a more personalized browsing experience, Fly.io supports setting up custom domains for your apps. Follow the &lt;a href="https://fly.io/docs/reference/regions" rel="noopener noreferrer"&gt;custom domain guide&lt;/a&gt; for detailed instructions.&lt;/p&gt;

&lt;h4&gt;
  
  
  Automation with Telegram Bot
&lt;/h4&gt;

&lt;p&gt;Included in the repository is a bonus feature: a Telegram bot for automating the start and stop of your VM, helping you manage usage and costs effectively. Setting up the bot takes just a few minutes, and you can control your cloud browser directly from Telegram, ensuring it's running only when you need it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Wrapping Up
&lt;/h3&gt;

&lt;p&gt;What we have set up is a DIY version of &lt;a href="https://neverinstall.com/" rel="noopener noreferrer"&gt;neverinstall.com&lt;/a&gt;, which runs applications on remote machines. Now that you have a browser of your own, you can deploy almost any app the same way: do a simple Google search for "KASM [NAME_OF_APP_YOU_WANT]". For example, if you want to deploy Ubuntu Desktop, search for the "&lt;a href="https://hub.docker.com/r/kasmweb/desktop" rel="noopener noreferrer"&gt;Kasm Ubuntu Docker image&lt;/a&gt;" and deploy that image to get a full-fledged desktop in the cloud.&lt;/p&gt;

&lt;p&gt;If you still have any questions about this post or want to discuss something with me, feel free to connect on &lt;a href="https://www.linkedin.com/in/dhanushreddy29/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or &lt;a href="https://twitter.com/dhanushreddy291" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you run an organization and want me to write for you, please connect with me on my Socials 🙃&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>cloud</category>
      <category>docker</category>
      <category>browser</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to Finetune Llama 2: A Beginner's Guide</title>
      <dc:creator>Dhanush Reddy</dc:creator>
      <pubDate>Mon, 11 Sep 2023 17:12:42 +0000</pubDate>
      <link>https://dev.to/dhanushreddy29/how-to-finetune-llama-2-a-beginners-guide-219e</link>
      <guid>https://dev.to/dhanushreddy29/how-to-finetune-llama-2-a-beginners-guide-219e</guid>
      <description>&lt;p&gt;Meta AI's LLaMA 2 has taken the NLP community by storm with its impressive range of pretrained and fine-tuned Large Language Models (LLMs). With model sizes ranging from 7B to a staggering 70B parameters, LLaMA 2 builds upon the success of its predecessor, LLaMA 1, offering a host of enhancements that have captivated the NLP community.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Evolution of LLaMA
&lt;/h2&gt;

&lt;p&gt;LLaMA 2 marks a significant evolution in the landscape of language models. It was trained on a corpus featuring 40% more tokens than LLaMA 1, and it supports a context length of up to 4,096 tokens, double that of its predecessor. This extended contextual understanding enables LLaMA 2 to excel in tasks that require nuanced comprehension of text and context.&lt;/p&gt;

&lt;p&gt;What makes LLaMA 2 even more extraordinary is its accessibility. Meta AI has generously made these advanced model weights available for both research and commercial applications. This democratization of cutting-edge language models ensures that a broader audience, from researchers to businesses, can harness the power of LLaMA 2 for their unique needs.&lt;/p&gt;

&lt;p&gt;To get access to Llama 2, you can follow these steps:&lt;/p&gt;

&lt;p&gt;Go to the Hugging Face Model Hub: &lt;a href="https://huggingface.co/meta-llama" rel="noopener noreferrer"&gt;huggingface.co/meta-llama&lt;/a&gt; and select the model that you want to use.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click on the "&lt;strong&gt;Request Access&lt;/strong&gt;" button.&lt;/li&gt;
&lt;li&gt;Fill out the form and submit it.&lt;/li&gt;
&lt;li&gt;Once your request has been approved, you will be able to download the model weights.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here are some additional details about each size of the Llama 2 model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;7B parameters&lt;/strong&gt;: The smallest size of the Llama 2 model. It is still powerful, but not as capable as the 13B or 70B models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;13B parameters&lt;/strong&gt;: The medium-sized version, a good choice for most applications.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;70B parameters&lt;/strong&gt;: The largest and most capable model, but also the most expensive to train and run.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this blog post, I will show you how to effortlessly fine-tune the LLaMA 2 - 7B model on a subset of the CodeAlpaca-20k dataset. This dataset contains over 20,000 coding questions and their corresponding correct answers. By fine-tuning the model on this dataset, we can teach it to generate code for a variety of tasks.&lt;/p&gt;

&lt;p&gt;To keep the process as simple as possible, we will use the &lt;a href="https://github.com/tloen/alpaca-lora" rel="noopener noreferrer"&gt;Alpaca LoRA training script&lt;/a&gt;, which automates the fine-tuning, and &lt;a href="https://beam.cloud" rel="noopener noreferrer"&gt;Beam&lt;/a&gt; for GPU compute.&lt;/p&gt;

&lt;p&gt;You can create a free account on &lt;a href="https://www.beam.cloud/" rel="noopener noreferrer"&gt;Beam&lt;/a&gt;, to get started.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;An account on &lt;a href="https://www.beam.cloud/login" rel="noopener noreferrer"&gt;Beam&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;An API Key from &lt;a href="https://www.beam.cloud/dashboard/settings/api-keys" rel="noopener noreferrer"&gt;Dashboard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Install &lt;a href="https://docs.beam.cloud/getting-started/quickstart" rel="noopener noreferrer"&gt;Beam CLI&lt;/a&gt; by running:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://raw.githubusercontent.com/slai-labs/get-beam/main/get-beam.sh &lt;span class="nt"&gt;-sSfL&lt;/span&gt; | sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Configure Beam by entering
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;beam configure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Install Beam SDK
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;beam-sdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you’re ready to start using Beam to deploy your ML models.&lt;/p&gt;

&lt;p&gt;To make it simple, I have made a &lt;a href="https://github.com/dhanushreddy291/finetune-llama2" rel="noopener noreferrer"&gt;Github Repo&lt;/a&gt;, which you can clone to start with.&lt;/p&gt;

&lt;p&gt;In the &lt;code&gt;app.py&lt;/code&gt; file, I use the &lt;a href="https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k" rel="noopener noreferrer"&gt;CodeAlpaca-20k&lt;/a&gt; dataset.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train_model&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Trained models will be saved to this path
&lt;/span&gt;    &lt;span class="n"&gt;beam_volume_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./checkpoints&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# Load dataset -- for this example, we'll use the sahil2801/CodeAlpaca-20k dataset hosted on Huggingface:
&lt;/span&gt;    &lt;span class="c1"&gt;# https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k
&lt;/span&gt;    &lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DatasetDict&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;train&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sahil2801/CodeAlpaca-20k&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;train[:20%]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Adjust the training loop based on the size of the dataset
&lt;/span&gt;    &lt;span class="n"&gt;samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;train&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;val_set_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;ceil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;base_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;base_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;val_set_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;val_set_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;output_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;beam_volume_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run the fine-tuning, execute the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;beam run app.py:train_model
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When we run this command, the training function will run on Beam's cloud, and we'll see the progress of the training process streamed to our terminal. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcujy9wivzm8u7xkgoblc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcujy9wivzm8u7xkgoblc.png" alt="LlaMA2 Finetuning logs" width="800" height="236"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The training may take hours to complete, depending on the size of the dataset you use for fine-tuning. In my case, since I used only 20% of the dataset, training completed in around an hour.&lt;/p&gt;

&lt;p&gt;Once the model is successfully trained, we can deploy an API to run inference with our fine-tuned model.&lt;/p&gt;

&lt;p&gt;Let's create a new function for inference. If you look closely, you'll notice that we're using a different decorator this time: &lt;code&gt;rest_api&lt;/code&gt; instead of &lt;code&gt;run&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This will allow us to deploy the function as a REST API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.rest_api&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_inference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Inputs passed to the API
&lt;/span&gt;    &lt;span class="nb"&gt;input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Grab the latest checkpoint
&lt;/span&gt;    &lt;span class="n"&gt;checkpoint&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_newest_checkpoint&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Initialize models
&lt;/span&gt;    &lt;span class="n"&gt;models&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_models&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tokenizer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;prompter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Generate text
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;call_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompter&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can deploy this as a REST API by running this command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;beam deploy app.py:run_inference
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3c0bp9a47tvb5gfyhr0i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3c0bp9a47tvb5gfyhr0i.png" alt="Deployment Logs" width="800" height="179"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If we navigate to the URL printed in the shell, we'll be able to copy the full cURL request to call the REST API.&lt;/p&gt;
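The same request can also be made from Python with just the standard library. The URL and token below are placeholders, so substitute the values Beam prints for your deployment:

```python
import json
import urllib.request

# Placeholders: copy the real URL and auth token from the output of `beam deploy`.
url = "https://apps.beam.cloud/your-app-id"
payload = json.dumps({"input": "How to download an image from its URL in Python?"}).encode()

request = urllib.request.Request(
    url,
    data=payload,
    headers={"Authorization": "Basic YOUR_AUTH_TOKEN", "Content-Type": "application/json"},
)

# Uncomment once the placeholders are filled in:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```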

&lt;p&gt;Now, when I tried asking how to download an image from its URL in Python&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"How to download an image from its URL in Python?"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I got this entire markdown string as a response:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsgk5r0ru14ppec77gpbu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsgk5r0ru14ppec77gpbu.png" alt="API Response" width="800" height="416"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I have prettified it below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;urllib.request&lt;/span&gt;

&lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://upload.wikimedia.org/wikipedia/commons/thumb/6/63/Square_logo_2008.png/1200px-Square_logo_2008.png&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;### Solution:
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;urllib.request&lt;/span&gt;

&lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://upload.wikimedia.org/wikipedia/commons/thumb/6/63/Square_logo_2008.png/1200px-Square_logo_2008.png&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
 &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="c1"&gt;### Explanation:
&lt;/span&gt;
&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="n"&gt;first&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="sb"&gt;`urllib.request`&lt;/span&gt; &lt;span class="n"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="n"&gt;second&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;create&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`with`&lt;/span&gt; &lt;span class="n"&gt;statement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="sb"&gt;`with`&lt;/span&gt; &lt;span class="n"&gt;statement&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;used&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="n"&gt;third&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;create&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`urlopen`&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="sb"&gt;`urlopen`&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;used&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="n"&gt;fourth&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;create&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`response`&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="sb"&gt;`response`&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;used&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="n"&gt;fifth&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;create&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`read`&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="sb"&gt;`read`&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;used&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;read&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="n"&gt;sixth&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="sb"&gt;`print`&lt;/span&gt; &lt;span class="n"&gt;statement&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;used&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="c1"&gt;### Reflection:
&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;What&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;difference&lt;/span&gt; &lt;span class="n"&gt;between&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`with`&lt;/span&gt; &lt;span class="n"&gt;statement&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`try`&lt;/span&gt; &lt;span class="n"&gt;statement&lt;/span&gt;&lt;span class="err"&gt;?&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;What&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;difference&lt;/span&gt; &lt;span class="n"&gt;between&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`urlopen`&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`request`&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="err"&gt;?&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;What&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;difference&lt;/span&gt; &lt;span class="n"&gt;between&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`response`&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`request`&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt;&lt;span class="err"&gt;?&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;What&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;difference&lt;/span&gt; &lt;span class="n"&gt;between&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`read`&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`request`&lt;/span&gt; &lt;span class="n"&gt;function&lt;/span&gt;&lt;span class="err"&gt;?&lt;/span&gt;
&lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;What&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;difference&lt;/span&gt; &lt;span class="n"&gt;between&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`print`&lt;/span&gt; &lt;span class="n"&gt;statement&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="sb"&gt;`request`&lt;/span&gt; &lt;span class="n"&gt;statement&lt;/span&gt;&lt;span class="err"&gt;?&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Although training the model on 20% of the dataset is not ideal, it is still a good way to get started. You can see that we are already starting to see good results with this small amount of data. If you want to get even better results, you can try fine-tuning the model on the entire dataset. This will take several hours, but it will be worth it in the end. Once the model is trained, you can use it whenever you need it.&lt;/p&gt;
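As an aside, the generated snippet prints the raw bytes rather than saving them. A small rework of the same idea (standard library only) that writes the image to disk instead:

```python
import urllib.request

def download_image(url: str, path: str) -> int:
    """Fetch an image from a URL and write the bytes to disk; returns the byte count."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    with open(path, "wb") as f:
        f.write(data)
    return len(data)

# Example usage (any direct image link works):
# download_image("https://example.com/logo.png", "logo.png")
```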

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In conclusion, we have seen how to fine-tune LLaMA 2 - 7B on a subset of the CodeAlpaca-20k dataset using the Alpaca Lora Training script. This script makes it easy to fine-tune the model without having to write any code.&lt;/p&gt;

&lt;p&gt;We have also seen that even by training the model on 20% of the dataset, we can get good results. If you want to get even better results, you can try fine-tuning the model on the entire dataset.&lt;/p&gt;

&lt;p&gt;The future of open source AI is bright. The availability of large language models like LLaMA 2 makes it possible for anyone to develop powerful AI applications. With the help of open source tools and resources, developers can fine-tune these models to meet their specific needs.&lt;/p&gt;

&lt;p&gt;If you still have any questions about this post or want to discuss something with me, feel free to connect on &lt;a href="https://www.linkedin.com/in/dhanushreddy29/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or &lt;a href="https://twitter.com/dhanushreddy291" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you run an organization and want me to write for you, please connect with me on my Socials 🙃&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>beginners</category>
      <category>tutorial</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Deploy Hugging Face Models on Serverless GPU</title>
      <dc:creator>Dhanush Reddy</dc:creator>
      <pubDate>Sun, 28 May 2023 10:58:31 +0000</pubDate>
      <link>https://dev.to/dhanushreddy29/deploy-hugging-face-models-on-serverless-gpu-47am</link>
      <guid>https://dev.to/dhanushreddy29/deploy-hugging-face-models-on-serverless-gpu-47am</guid>
<description>&lt;p&gt;&lt;strong&gt;Hugging Face&lt;/strong&gt; is a platform and community focused on making artificial intelligence and data science more accessible. It promotes open source contributions so that AI knowledge and resources are widely available, and it gives AI and data science professionals a space to connect and share their work.&lt;/p&gt;

&lt;p&gt;Hugging Face provides state of the art machine learning models for different tasks. It has a vast number of pre-trained models in categories such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Computer Vision (Image Segmentation, Image Classification, Image Generation, etc.)&lt;/li&gt;
&lt;li&gt;Natural Language Processing (Text Classification, Summarization, Generation, Translation, etc.)&lt;/li&gt;
&lt;li&gt;Audio (Speech Recognition, Text to Speech, etc.)&lt;/li&gt;
&lt;li&gt;and much &lt;a href="https://huggingface.co/models" rel="noopener noreferrer"&gt;more&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hugging Face's models are easy to use. For example, to do a text classification task, you can use the &lt;a href="https://huggingface.co/docs/transformers/index" rel="noopener noreferrer"&gt;transformers&lt;/a&gt; library (which is part of Hugging Face), to load a pre-trained model and then use it to classify a text.&lt;/p&gt;

&lt;p&gt;An example of text classification using &lt;code&gt;transformers&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pipeline&lt;/span&gt;

&lt;span class="n"&gt;sentiment_analysis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sentiment-analysis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;siebert/sentiment-roberta-large-english&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;sentiment_analysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I like Transformers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c1"&gt;# [{'label': 'POSITIVE', 'score': 0.9987214207649231}]
&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;sentiment_analysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I hate React&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c1"&gt;# [{'label': 'NEGATIVE', 'score': 0.9993581175804138}]
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can see how easy it is to do a text classification task in just 3 lines of code using the transformers library.&lt;/p&gt;

&lt;p&gt;Similarly, Hugging Face hosts many other models, each with a corresponding model card describing its usage.&lt;/p&gt;

&lt;p&gt;One of the challenges of using Hugging Face models is that they can be computationally expensive to deploy, as most of them require GPUs. They are often large and complex, and need a lot of computing power to run, and GPUs can be very expensive. For example, the hourly rate of an NVIDIA A10G instance with 24 GB of memory on &lt;strong&gt;AWS&lt;/strong&gt; is $1.30, which works out to roughly $950 for a month of continuous uptime. This shows how you can easily &lt;em&gt;burn money on AWS&lt;/em&gt; 😅 on an ML model that hardly gets 1000-2000 requests in a month.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyp95iy92452vjm9rrfxc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyp95iy92452vjm9rrfxc.png" alt="An Image saying that there is a solution" width="476" height="313"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Enter Serverless GPUs
&lt;/h1&gt;

&lt;p&gt;Serverless GPUs are a type of cloud computing service that provides access to powerful GPUs on demand. This means that you only pay for the time that you use the GPUs, which can save you a significant amount of money if you only need to use them occasionally.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pay-as-You-Go Pricing&lt;/strong&gt;: Serverless architectures follow a pay-as-you-go pricing model, so you only pay for the actual resources consumed during model inference.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ease of use&lt;/strong&gt;: Serverless GPUs are very easy to use. You don't need to worry about managing or maintaining the hardware, which can save you a lot of time and hassle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt;: Serverless GPUs can be scaled up or down as needed, which makes them ideal for applications that have fluctuating workloads.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhhgoogrtdeylezl67hsi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhhgoogrtdeylezl67hsi.png" alt="A meme showing using servers vs serverless" width="705" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are a number of serverless GPU providers out there, such as &lt;a href="https://www.banana.dev/" rel="noopener noreferrer"&gt;Banana&lt;/a&gt;, &lt;a href="https://replicate.com/" rel="noopener noreferrer"&gt;Replicate&lt;/a&gt;, &lt;a href="https://www.beam.cloud/" rel="noopener noreferrer"&gt;Beam&lt;/a&gt;, &lt;a href="https://modal.com/" rel="noopener noreferrer"&gt;Modal&lt;/a&gt; and many more.&lt;/p&gt;

&lt;p&gt;I would recommend checking out all of these websites before deploying your application. All of the ones I have mentioned offer a free tier.&lt;/p&gt;

&lt;p&gt;Returning to the earlier cost calculation: for 16 CPUs with 32 GB of RAM on an A10G instance, the price on Beam was just $0.00155419/second (almost all of the providers have near-identical pricing). Say your &lt;strong&gt;API&lt;/strong&gt; gets 1500 requests in a month, with an average inference time of 1 minute (60 seconds). The cost incurred is: 0.00155419 * 60 * 1500 ≈ $140. Nearly a 7x reduction😎 in cost compared to keeping a dedicated AWS instance running.&lt;/p&gt;
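As a quick sanity check on this arithmetic, here is a tiny back-of-envelope script using the per-second rate quoted above (your provider's pricing and workload will differ, so treat the inputs as assumptions):

```python
# Back-of-envelope serverless GPU cost estimate.
RATE_PER_SECOND = 0.00155419   # USD/second for the A10G config quoted above
SECONDS_PER_REQUEST = 60       # assumed average inference time
REQUESTS_PER_MONTH = 1500      # assumed monthly traffic

cost_per_request = RATE_PER_SECOND * SECONDS_PER_REQUEST
monthly_cost = cost_per_request * REQUESTS_PER_MONTH

print(f"per request: ${cost_per_request:.4f}")  # per request: $0.0933
print(f"per month:   ${monthly_cost:.2f}")      # per month:   $139.88
```

Plug in your own traffic numbers; past a certain sustained load, a dedicated instance becomes cheaper again.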

&lt;p&gt;For this tutorial, I am going to use &lt;a href="https://www.beam.cloud/" rel="noopener noreferrer"&gt;Beam&lt;/a&gt; to deploy &lt;a href="https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html" rel="noopener noreferrer"&gt;dolly-v2-7b&lt;/a&gt;, an open source large language model from &lt;a href="https://www.databricks.com/" rel="noopener noreferrer"&gt;Databricks&lt;/a&gt;, which responds similar to &lt;a href="https://openai.com/blog/chatgpt" rel="noopener noreferrer"&gt;ChatGPT&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Of course, feel free to use any ML model with your choice of serverless GPU provider.&lt;/p&gt;

&lt;h1&gt;
  
  
  Deployment
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;An account on &lt;a href="https://www.beam.cloud/login" rel="noopener noreferrer"&gt;Beam&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;An API Key from &lt;a href="https://www.beam.cloud/dashboard/settings/api-keys" rel="noopener noreferrer"&gt;Dashboard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Install &lt;a href="https://docs.beam.cloud/getting-started/quickstart" rel="noopener noreferrer"&gt;Beam CLI&lt;/a&gt; by running:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://raw.githubusercontent.com/slai-labs/get-beam/main/get-beam.sh &lt;span class="nt"&gt;-sSfL&lt;/span&gt; | sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Configure Beam by entering
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;beam configure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Install Beam SDK
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;beam-sdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you’re ready to start using Beam to deploy your ML models.&lt;/p&gt;

&lt;p&gt;As mentioned, I will be deploying &lt;a href="https://huggingface.co/databricks/dolly-v2-7b" rel="noopener noreferrer"&gt;dolly-v2-7b&lt;/a&gt; from Databricks. The code to run it is provided on Hugging Face, so copy it into a file named &lt;code&gt;run.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;instruct_pipeline&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;InstructionTextGenerationPipeline&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;

&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;databricks/dolly-v2-7b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;padding_side&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;left&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;databricks/dolly-v2-7b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device_map&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auto&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;torch_dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bfloat16&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;generate_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;InstructionTextGenerationPipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is one major problem with this code: no cache directory is passed when instantiating &lt;code&gt;AutoModelForCausalLM&lt;/code&gt; and &lt;code&gt;AutoTokenizer&lt;/code&gt;, so the model files will be downloaded every time the function is invoked. Of course, this does not happen when you run the code on your own device, because the default cache directory lives on your hard disk. But in the serverless world everything is stateless, and model files downloaded once will not persist across consecutive runs.&lt;/p&gt;

&lt;p&gt;Digging through the Beam docs, we find &lt;a href="https://docs.beam.cloud/data/shared-volumes" rel="noopener noreferrer"&gt;Shared Volumes&lt;/a&gt;, which let model files persist between consecutive runs: the volume is mounted like a normal disk whenever the model runs.&lt;/p&gt;

&lt;p&gt;So for now, let's define &lt;code&gt;cache_path = "./mpt_weights"&lt;/code&gt; and pass it as the &lt;code&gt;cache_dir&lt;/code&gt;. The modified code is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;instruct_pipeline&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;InstructionTextGenerationPipeline&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;

&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;databricks/dolly-v2-7b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;padding_side&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;left&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cache_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cache_path&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;databricks/dolly-v2-7b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;device_map&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auto&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;torch_dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bfloat16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cache_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cache_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;generate_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;InstructionTextGenerationPipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To define the configuration of the system where our ML model runs, we need to write a &lt;a href="https://docs.beam.cloud/getting-started/quickstart" rel="noopener noreferrer"&gt;Beam app definition&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I have mostly copied this code from Beam Docs. Creating a new file named &lt;code&gt;app.py&lt;/code&gt; and adding code as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;beam&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;beam&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;App&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mpt-7b-chat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;32Gi&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;gpu&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A10G&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;python_packages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;accelerate&amp;gt;=0.16.0,&amp;lt;1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transformers[torch]&amp;gt;=4.28.1,&amp;lt;5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;torch&amp;gt;=1.13.1,&amp;lt;2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Trigger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;RestAPI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;beam&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;()},&lt;/span&gt;
    &lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;beam&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;()},&lt;/span&gt;
    &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;run.py:generate_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;keep_warm_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Mount&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;SharedVolume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mpt_weights&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./mpt_weights&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above code creates a Beam application that uses 16 CPUs, 32 GB of memory, and an A10G GPU, and lists the pip packages required during the model run. The application is triggered by a REST API call and generates text in response to a prompt; &lt;code&gt;keep_warm_seconds&lt;/code&gt; keeps the container alive for 60 seconds after a request, so back-to-back calls avoid a cold start.&lt;/p&gt;

&lt;p&gt;Now let's modify &lt;code&gt;run.py&lt;/code&gt; one last time before deploying. The &lt;code&gt;generate_text&lt;/code&gt; handler we define in &lt;code&gt;run.py&lt;/code&gt; needs to return the model's response so that it can be sent back through the REST API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;instruct_pipeline&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;InstructionTextGenerationPipeline&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;

&lt;span class="n"&gt;cache_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./mpt_weights&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;databricks/dolly-v2-3b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;padding_side&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;left&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cache_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cache_path&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;databricks/dolly-v2-3b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;device_map&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auto&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;torch_dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bfloat16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;cache_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cache_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;generate_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;InstructionTextGenerationPipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generate_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;generated_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Please don't be intimidated by this code; I haven't modified much from the version on &lt;a href="https://huggingface.co/databricks/dolly-v2-7b#:~:text=model%20and%20tokenizer%3A-,import%20torch,-from%20instruct_pipeline%20import" rel="noopener noreferrer"&gt;Hugging Face&lt;/a&gt;. I would highly suggest you first refer to the respective &lt;a href="https://docs.beam.cloud/introduction" rel="noopener noreferrer"&gt;docs&lt;/a&gt; before deploying anything to the cloud that a random dude on the internet suggests.&lt;/p&gt;

&lt;p&gt;One more thing: we need &lt;code&gt;instruct_pipeline.py&lt;/code&gt; for dolly-v2-3b, as mentioned on Hugging Face, so copy the code from &lt;a href="https://huggingface.co/databricks/dolly-v2-3b/blob/main/instruct_pipeline.py" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So now we have 3 files, namely &lt;code&gt;app.py&lt;/code&gt;, &lt;code&gt;run.py&lt;/code&gt; and &lt;code&gt;instruct_pipeline.py&lt;/code&gt; (may not be the same in your case).&lt;/p&gt;

&lt;p&gt;Now deploy your application by entering:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;beam deploy app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This deploys your ML model as a serverless REST API that you can call from your frontend. Obviously, you can deploy the model as an asynchronous &lt;a href="https://docs.beam.cloud/triggers/webhook" rel="noopener noreferrer"&gt;webhook&lt;/a&gt; instead of a REST API if your model inference takes a long time.&lt;/p&gt;

&lt;p&gt;You can get the &lt;code&gt;CURL&lt;/code&gt; command to call your app from the Beam app dashboard. Testing the app I deployed:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv4oqn877dxi3w1jzf7rf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv4oqn877dxi3w1jzf7rf.png" alt="An Image showing ML inference on Serverless GPU" width="800" height="507"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I got a response in under 8 seconds, which is not bad for an open-source version of ChatGPT :)&lt;/p&gt;
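For reference, calling the deployed endpoint from Python might look like the sketch below. The URL and token are placeholders (copy the real values from the curl command on your Beam dashboard), and the auth scheme may differ, so treat this as the shape of the request rather than a copy-paste recipe:

```python
import json

# Hypothetical endpoint and token -- replace with the real values
# from the curl command shown on your Beam dashboard.
API_URL = "https://example.invalid/your-beam-app"
AUTH_TOKEN = "YOUR_API_TOKEN"

# The payload matches the `inputs` schema declared in app.py.
payload = {"prompt": "Explain serverless GPUs in one sentence"}
headers = {
    "Authorization": f"Basic {AUTH_TOKEN}",  # auth scheme is an assumption
    "Content-Type": "application/json",
}
body = json.dumps(payload)

# Uncomment to actually call the API (requires `pip install requests`):
# import requests
# resp = requests.post(API_URL, headers=headers, data=body)
# print(resp.json()["response"])

print(body)  # {"prompt": "Explain serverless GPUs in one sentence"}
```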

&lt;h1&gt;
  
  
  Cons
&lt;/h1&gt;

&lt;p&gt;Everything seems good so far with serverless, but let's weigh the cons.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cold starts&lt;/strong&gt;: When a serverless GPU is not in use, it is shut down. This means that the first time you use it, it will need to be booted up, which can take a few seconds to minutes depending on your model size. This can be a problem if you are running a real-time application.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost&lt;/strong&gt;: Serverless GPUs can be more expensive than traditional GPU-based solutions, especially if you are running a long-running application that almost stays active throughout the day.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;In conclusion, we have seen how to deploy Hugging Face models on serverless GPUs. This can be a great way to get the performance benefits of GPUs without paying for the underlying hardware when it is not in use.&lt;/p&gt;

&lt;p&gt;If you still have any questions about this post or want to discuss something with me, feel free to connect on &lt;a href="https://www.linkedin.com/in/dhanushreddy29/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or &lt;a href="https://twitter.com/dhanushreddy291" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you run an organization and want me to write for you, please connect with me on my Socials 🙃&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>huggingface</category>
      <category>serverless</category>
      <category>gpu</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Effortless Documentation of your Python Code with Github Actions and GPT3</title>
      <dc:creator>Dhanush Reddy</dc:creator>
      <pubDate>Thu, 04 May 2023 13:23:24 +0000</pubDate>
      <link>https://dev.to/dhanushreddy29/effortless-documentation-of-your-python-code-with-github-actions-and-gpt3-a27</link>
      <guid>https://dev.to/dhanushreddy29/effortless-documentation-of-your-python-code-with-github-actions-and-gpt3-a27</guid>
      <description>&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;I built a &lt;strong&gt;Github Actions&lt;/strong&gt; workflow that &lt;strong&gt;automates&lt;/strong&gt; the process of adding &lt;strong&gt;docstrings to Python functions&lt;/strong&gt; using &lt;strong&gt;GPT-3&lt;/strong&gt;. The workflow loops over all &lt;code&gt;.py&lt;/code&gt; files and functions in each one of them, sends the function code to the GPT-3 API for analysis, and inserts the suggested docstring for the function if it does not already have one.&lt;/p&gt;

&lt;p&gt;This solution streamlines the process of documenting Python code and saves developers time and effort by automating the task of adding docstrings to functions. By using Github Actions and GPT-3, my solution helps developers to focus on other aspects of their work while maintaining high-quality code documentation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Category Submission:
&lt;/h3&gt;

&lt;p&gt;Maintainer Must-Haves&lt;/p&gt;

&lt;h3&gt;
  
  
  App Link
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/dhanushreddy291/docstring-generator" rel="noopener noreferrer"&gt;github.com/dhanushreddy291/docstring-generator&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Screenshots
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiujaq0r56xvq9mw7emee.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiujaq0r56xvq9mw7emee.png" alt="An Image showing the demo of Github Action" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fycm7amwzu0q97x0eooym.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fycm7amwzu0q97x0eooym.png" alt="Another Image showing the demo of Github Action" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Description
&lt;/h3&gt;

&lt;p&gt;My project is a Github Actions workflow that automates the process of adding docstrings to Python functions using GPT-3. The solution is designed to streamline the process of documenting Python code by automatically generating docstrings for functions that do not have one.&lt;/p&gt;

&lt;p&gt;The workflow is triggered whenever changes are pushed to the repository. It loops over all &lt;code&gt;.py&lt;/code&gt; files to find functions without a docstring; for each such function, the code is sent to the GPT-3 API, which analyzes it and suggests a corresponding docstring. Functions that already have a docstring are skipped. Finally, the modified code is automatically committed back to the GitHub repo.&lt;/p&gt;
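&lt;p&gt;Finding the functions that still need a docstring is the core of that loop. As a minimal sketch using Python's standard-library &lt;code&gt;ast&lt;/code&gt; module (the project itself uses the RedBaron parser, so this is only an illustration):&lt;/p&gt;

```python
import ast

SOURCE = '''
def documented():
    """Already has a docstring."""
    return 1

def undocumented(x):
    return x * 2
'''

tree = ast.parse(SOURCE)

# Walk the tree and collect every function that has no docstring;
# these would be the candidates whose code gets sent to GPT-3.
missing = [
    node.name
    for node in ast.walk(tree)
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    and ast.get_docstring(node) is None
]

print(missing)  # ['undocumented']
```

&lt;p&gt;Functions whose &lt;code&gt;ast.get_docstring&lt;/code&gt; result is not &lt;code&gt;None&lt;/code&gt; are left untouched, which matches the "skip documented functions" behaviour described above.&lt;/p&gt;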

&lt;p&gt;By automating the task of adding docstrings to Python functions, my project saves developers time and effort, allowing them to focus on other aspects of their work. The solution also helps maintain high-quality code documentation, which is essential for the long-term maintainability and scalability of any software project.&lt;/p&gt;

&lt;p&gt;Of course, I am just scratching the surface of what is possible with GitHub Actions and GPT-3. One could use the GPT-4 32K-token model API to build full Markdown documentation by analyzing the code. At the moment, my workflow works only for Python, but it can easily be extended to other languages.&lt;/p&gt;

&lt;h3&gt;
  
  
  Link to Source Code
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/dhanushreddy291/docstring-generator" rel="noopener noreferrer"&gt;github.com/dhanushreddy291/docstring-generator&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Permissive License
&lt;/h3&gt;

&lt;p&gt;MIT&lt;/p&gt;

&lt;h2&gt;
  
  
  Background (What made you decide to build this particular app? What inspired you?)
&lt;/h2&gt;

&lt;p&gt;As a software developer, I have often found the process of writing code documentation to be time-consuming. I knew that there had to be a better way to document code that did not require manual effort and could produce consistent and reliable results.&lt;/p&gt;

&lt;p&gt;That's when I came across GPT-3, a powerful language model developed by OpenAI that has been trained on vast amounts of text and code and can generate human-like responses to natural language inputs. I realized it could be leveraged to automate the process of adding docstrings to Python functions. And why not? If &lt;strong&gt;GitHub Copilot&lt;/strong&gt; is being used to write &lt;a href="https://github.blog/2023-03-22-github-copilot-x-the-ai-powered-developer-experience/#:~:text=Many%20developers%20and%20companies%20have%20already%20used%20GitHub%20Copilot%2C%20and%20it%E2%80%99s%20helping%20improve%20productivity%20and%20happiness." rel="noopener noreferrer"&gt;46% of code&lt;/a&gt;, why not do the same for code documentation?&lt;/p&gt;

&lt;p&gt;That's why I decided to build a Github Actions workflow that uses GPT-3 to generate docstrings for Python functions automatically. My goal was to create a solution that would save developers time and effort, while also improving the quality of code documentation.&lt;/p&gt;

&lt;p&gt;Overall, I was inspired by the idea of using GPT-3 to automate a task that has traditionally been done manually. I believe this project has the potential to transform the way developers approach code documentation, making it easier, faster, and more reliable than ever before. Of course, the documentation will keep improving as better LLMs such as GPT-4 and beyond become available.&lt;/p&gt;

&lt;h3&gt;
  
  
  How I built it (How did you utilize GitHub Actions or GitHub Codespaces? Did you learn something new along the way? Pick up a new skill?)
&lt;/h3&gt;

&lt;p&gt;For this project, I utilized GitHub Actions to automate the process of adding docstrings to Python files in my repository. The workflow is triggered on a push event to the main branch of the repository.&lt;/p&gt;

&lt;p&gt;It has the following steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Check out repository&lt;/strong&gt; checks out the latest version of the repository.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set up Python and install dependencies&lt;/strong&gt; sets up a Python 3.10 environment and installs the required dependencies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run add_docstring script&lt;/strong&gt; executes the &lt;code&gt;add_docstring.py&lt;/code&gt; script with the path to the file that needs to be updated.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check for changes&lt;/strong&gt; checks whether any changes have been made to the repository. If there are, it sets the output &lt;code&gt;has_changes&lt;/code&gt; to &lt;code&gt;true&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commit and push changes&lt;/strong&gt; commits and pushes the changes back to the main branch.&lt;/li&gt;
&lt;/ul&gt;
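&lt;p&gt;The "check for changes" step boils down to asking git whether the working tree is dirty. As a rough sketch of that check in Python (the workflow itself does this with a shell command; the helper name here is my own):&lt;/p&gt;

```python
import subprocess

def repo_has_changes(path: str = ".") -> bool:
    """Return True if the working tree at `path` has uncommitted changes."""
    # `git status --porcelain` prints one line per modified or untracked
    # file, and prints nothing when the tree is clean.
    out = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=path,
        capture_output=True,
        text=True,
        check=True,
    )
    return bool(out.stdout.strip())
```

&lt;p&gt;Only when this check reports changes does the final step commit and push, so clean runs don't create empty commits.&lt;/p&gt;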

&lt;p&gt;The &lt;code&gt;add_docstring.py&lt;/code&gt; script uses the OpenAI GPT-3 language model to generate docstrings for the Python functions in the repository. The script is also responsible for formatting the Python files using the &lt;code&gt;black&lt;/code&gt; and &lt;code&gt;autoflake&lt;/code&gt; libraries.&lt;/p&gt;

&lt;p&gt;During this project, I learned how to use GitHub Actions and integrate them into my workflow. I also learned how to use the &lt;strong&gt;RedBaron&lt;/strong&gt; library to parse and manipulate Python code.&lt;/p&gt;

&lt;p&gt;I do admit I faced a few challenges. The main one was working with the OpenAI API: to avoid rate limiting, I had to add a delay of 20 seconds after every request (free-trial accounts have a hard cap of 3 requests/min). This was necessary to ensure the API would not block the requests, which would have caused the workflow to fail.&lt;/p&gt;
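&lt;p&gt;That throttling can be sketched as a tiny wrapper; the 20-second figure follows directly from the 3 requests/minute cap, and the helper name here is my own:&lt;/p&gt;

```python
import time

REQUESTS_PER_MINUTE = 3  # hard cap on free-trial OpenAI accounts

def throttled(fn, *args, delay=60 / REQUESTS_PER_MINUTE, **kwargs):
    """Call fn, then wait long enough to stay under the rate limit."""
    result = fn(*args, **kwargs)
    time.sleep(delay)  # ~20 s between consecutive API requests
    return result

# Example: wrap each (hypothetical) docstring-generation call:
# docstring = throttled(generate_docstring, function_source)
```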

&lt;p&gt;Another challenge was integrating the script with GitHub Actions. I had to learn how to create a workflow file and configure the necessary settings so that the script would run automatically whenever changes were pushed to the main branch. I also had to debug some permission issues related to committing back to the repo.&lt;/p&gt;

&lt;p&gt;Overall, the project was a great learning experience that helped me improve my skills in working with APIs and automating tasks using GitHub Actions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Additional Resources/Info
&lt;/h3&gt;

&lt;p&gt;You can find the entire source code &lt;a href="https://github.com/dhanushreddy291/docstring-generator" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>githubhack23</category>
      <category>githubactions</category>
      <category>gpt3</category>
      <category>python</category>
    </item>
    <item>
      <title>Fine-Tune GPT-3 on custom datasets with just 10 lines of code using GPT-Index</title>
      <dc:creator>Dhanush Reddy</dc:creator>
      <pubDate>Sun, 12 Feb 2023 13:52:48 +0000</pubDate>
      <link>https://dev.to/dhanushreddy29/fine-tune-gpt-3-on-custom-dataset-with-just-10-lines-of-code-using-gpt-index-18mc</link>
      <guid>https://dev.to/dhanushreddy29/fine-tune-gpt-3-on-custom-dataset-with-just-10-lines-of-code-using-gpt-index-18mc</guid>
      <description>&lt;p&gt;The Generative Pre-trained Transformer 3 (GPT-3) model by OpenAI is a state-of-the-art language model that has been trained on a massive amount of text data. GPT3 is capable of generating human-like text, performing tasks like question-answering, summarization, and even writing creative fiction. Wouldn't it be cool if you feed GPT3 with your own data source and ask it questions.&lt;/p&gt;

&lt;p&gt;In this blog post, we'll do exactly that: fine-tune GPT-3 on custom datasets using GPT-Index, all with just 10 lines of code! GPT-Index does the heavy lifting by providing a high-level API for connecting external knowledge bases with LLMs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;You need to have Python installed on your system.&lt;/li&gt;
&lt;li&gt;An OpenAI API key. If you don't have one, create a new account on &lt;a href="https://openai.com/api/" rel="noopener noreferrer"&gt;openai.com/api&lt;/a&gt; and get $18 of free credits.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;p&gt;I am not going into the details of how all this is working, as this would make this blog post longer and go against the title. You can refer to &lt;a href="https://gpt-index.readthedocs.io/en/latest/index.html" rel="noopener noreferrer"&gt;gpt-index.readthedocs.io/en/latest&lt;/a&gt; if you need to learn more.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Create a folder and open it in your favorite code editor. Create a &lt;a href="https://python.land/virtual-environments/virtualenv" rel="noopener noreferrer"&gt;virtual environment&lt;/a&gt; for this project if needed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For this tutorial, we need &lt;a href="https://pypi.org/project/gpt-index/" rel="noopener noreferrer"&gt;gpt-index&lt;/a&gt; and &lt;a href="https://python.langchain.com/en/latest/index.html" rel="noopener noreferrer"&gt;Langchain&lt;/a&gt; installed. Please install the exact versions mentioned here to avoid any breaking changes.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;gpt-index&lt;span class="o"&gt;==&lt;/span&gt;0.4.1 &lt;span class="nv"&gt;langchain&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;0.0.83
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your data sources are in the form of PDFs, also install &lt;code&gt;PyPDF2&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;&lt;span class="nv"&gt;PyPDF2&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;3.0.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now create a new file &lt;code&gt;main.py&lt;/code&gt; and add the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;YOUR_OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;gpt_index&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GPTSimpleVectorIndex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SimpleDirectoryReader&lt;/span&gt;
&lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SimpleDirectoryReader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;load_data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GPTSimpleVectorIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# save to disk
&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save_to_disk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;index.json&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For this code to run, your data sources, be they PDFs, text files, etc., need to be inside a directory named &lt;strong&gt;&lt;em&gt;data&lt;/em&gt;&lt;/strong&gt; in the same folder. Run the code after adding your data.&lt;/p&gt;

&lt;p&gt;Your project directory should look something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;project/
├─ data/
│  ├─ data1.pdf
├─ query.py
├─ main.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Now create another file named &lt;code&gt;query.py&lt;/code&gt; and add the following code:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;YOUR_OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;gpt_index&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GPTSimpleVectorIndex&lt;/span&gt;

&lt;span class="c1"&gt;# load from disk
&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;GPTSimpleVectorIndex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_from_disk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;index.json&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Any Query You have in your datasets&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you run this code, you will get a response from OpenAI for the query you sent.&lt;/p&gt;

&lt;p&gt;I tried using this paper from &lt;a href="https://arxiv.org/abs/2206.00225v1" rel="noopener noreferrer"&gt;arXiv&lt;/a&gt; as a data source and asked the following query:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkd4haffw9oxe85bwmva.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkd4haffw9oxe85bwmva.png" alt="An Example Query to GPT3 with the coreesponding Response" width="800" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;With GPT-Index, it has become much easier to work with GPT-3 and fine-tune it with just a few lines of code. I hope this small post has shown you how to get started with GPT-3 on custom datasets using GPT-Index.&lt;/p&gt;

&lt;p&gt;Of course, you can set up a simple frontend to give it a chatbot look, like ChatGPT.&lt;/p&gt;

&lt;p&gt;If you still have any questions about this post or want to discuss something with me, feel free to connect on &lt;a href="https://www.linkedin.com/in/dhanushreddy29/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or &lt;a href="https://twitter.com/dhanushreddy291" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you run an organization and want me to write for you, please connect with me on my Socials 🙃&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>gpt3</category>
      <category>openai</category>
      <category>beginners</category>
    </item>
    <item>
      <title>From Commit to Registry: A Guide to Automate Docker Image Builds with Github Actions</title>
      <dc:creator>Dhanush Reddy</dc:creator>
      <pubDate>Wed, 01 Feb 2023 11:30:27 +0000</pubDate>
      <link>https://dev.to/dhanushreddy29/from-commit-to-registry-a-guide-to-automate-docker-image-builds-with-github-actions-4c6i</link>
      <guid>https://dev.to/dhanushreddy29/from-commit-to-registry-a-guide-to-automate-docker-image-builds-with-github-actions-4c6i</guid>
      <description>&lt;p&gt;&lt;strong&gt;Github Actions&lt;/strong&gt; is a powerful tool that allows developers to automate their software development workflows. It allows triggering actions, such as building and deploying code, in response to events such as a code push or a pull request. You may save time and effort by automating your development process with Github Actions custom workflows.&lt;/p&gt;

&lt;p&gt;Github Actions is integrated directly into Github, making it easy to set up and use. It can be used to automate a wide range of tasks. One of the most popular use cases for Github Actions is automating the build and deployment of Docker images. By using Github Actions, you can automate the process of building and pushing Docker images to a registry, such as Docker Hub. This can save you a significant amount of time and effort, and ensure that your images are built and deployed consistently and correctly.&lt;/p&gt;

&lt;p&gt;In this blog post, we'll go through how to use Github Actions to automate the creation and distribution of Docker images. We will be covering everything from setting up the necessary prerequisites to writing the code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why automate Docker image builds?
&lt;/h2&gt;

&lt;p&gt;That's a good question. Automating the process of building and deploying Docker images can bring several benefits to your development process. Some of the main reasons to automate Docker image builds include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Consistency&lt;/strong&gt;: Automating the build process ensures that your images are built consistently and correctly every time. This can help to eliminate errors that may occur due to manual processes, such as typos or missed steps.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Speed&lt;/strong&gt;: Automating the build process can save you a significant amount of time, as building and deployment happen quickly and efficiently. Similarly, multi-architecture images can be built in the same workflow.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Continuous Integration and Deployment&lt;/strong&gt;: By automating the build process, you can easily integrate the process with the other steps of your development process, such as testing and deploying your code. This can help to ensure that your images are always up-to-date and ready to be deployed to production.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before diving into automating your Docker image builds with Github Actions, there are a few prerequisites that you will need to have in place. These include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/" rel="noopener noreferrer"&gt;A Github account&lt;/a&gt;: In order to use Github Actions, you will need to have a Github account and be able to access the repository where you want to set up the automation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://hub.docker.com/" rel="noopener noreferrer"&gt;A Docker Hub account&lt;/a&gt;: In order to push your images to Docker Hub, you will need to have an account and be logged in.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A Dockerfile: Your repository should contain a Dockerfile that describes the instructions to build your image.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Setting up Github Secrets
&lt;/h2&gt;

&lt;p&gt;In order for your Github Actions workflow to push images to your Docker Hub account, you will need to set up secrets for your Docker Hub credentials in your Github repository. This will allow the workflow to securely access your account and perform the necessary actions.&lt;/p&gt;

&lt;p&gt;To set up secrets:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Go to the main page of your Github repository.&lt;/li&gt;
&lt;li&gt; Click on the "Settings" tab.&lt;/li&gt;
&lt;li&gt; Under the "Secrets and Variables" section, click on "Actions" and then on "New Repository Secret"&lt;/li&gt;
&lt;li&gt; Enter a name for the secret, such as &lt;code&gt;DOCKERHUB_USERNAME&lt;/code&gt;, and enter your Docker Hub username as the value.&lt;/li&gt;
&lt;li&gt; Click "Add secret".&lt;/li&gt;
&lt;li&gt;You can generate an access token for your Docker Hub account on &lt;a href="https://hub.docker.com/settings/security" rel="noopener noreferrer"&gt;hub.docker.com/settings/security&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; Repeat this process to add another secret for the Docker Hub token. In the code that follows this section, I have named it &lt;code&gt;DOCKERHUB_TOKEN&lt;/code&gt;, so make sure the name matches what you define in your Github settings.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Once you've set up your secrets, you can reference them in your Github Actions workflow file using the syntax &lt;code&gt;${{ secrets.SECRET_NAME }}&lt;/code&gt;. This will allow the workflow to access your Docker Hub credentials without exposing them in plaintext in the file.&lt;/p&gt;

&lt;p&gt;The secrets which we just created are encrypted and can only be accessed by GitHub Actions running on the same repository. By setting up secrets for your Docker Hub credentials, you can ensure that your automation process is secure and that your credentials are protected.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Code
&lt;/h2&gt;

&lt;p&gt;At the root of your Github repo, create a folder named &lt;code&gt;.github&lt;/code&gt;, and inside it create another folder named &lt;code&gt;workflows&lt;/code&gt;. Then create a file named &lt;code&gt;build-and-push.yaml&lt;/code&gt; (or any name you like) for the Github Action.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build-and-push&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;main"&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Checkout&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v3&lt;/span&gt;  &lt;span class="c1"&gt;# This step checkouts the code of the repository&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Login to Docker Hub&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker/login-action@v2&lt;/span&gt; &lt;span class="c1"&gt;# This step logs in to the Docker Hub using the secrets&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.DOCKERHUB_USERNAME }}&lt;/span&gt;
          &lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.DOCKERHUB_TOKEN }}&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Set up Docker Buildx&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker/setup-buildx-action@v2&lt;/span&gt; &lt;span class="c1"&gt;# This step sets up Buildx, a tool that allows building multi-arch images&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build and push&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker/build-push-action@v3&lt;/span&gt; &lt;span class="c1"&gt;# This step performs the actual build and push to Docker Hub&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.&lt;/span&gt;
          &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./Dockerfile&lt;/span&gt;
          &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
          &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.DOCKERHUB_USERNAME }}/go-server:latest&lt;/span&gt;
          &lt;span class="na"&gt;platforms&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;linux/amd64,linux/arm64,linux/arm/v7&lt;/span&gt;
          &lt;span class="c1"&gt;# Cache the image layers&lt;/span&gt;
          &lt;span class="na"&gt;cache-from&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;type=gha&lt;/span&gt;
          &lt;span class="na"&gt;cache-to&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;type=gha,mode=max&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The code starts by defining the name of the workflow and the &lt;code&gt;on&lt;/code&gt; event, which is a push to the &lt;code&gt;main&lt;/code&gt; branch.&lt;/p&gt;

&lt;p&gt;Then there is a &lt;code&gt;jobs&lt;/code&gt; block, where we define a job named &lt;code&gt;build&lt;/code&gt;. This job runs on an &lt;code&gt;ubuntu-latest&lt;/code&gt; machine.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;steps&lt;/code&gt; block defines the steps the job will perform. In this case, they are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check out the code from the repository.&lt;/li&gt;
&lt;li&gt;Log in to Docker Hub using the secrets that we set up earlier.&lt;/li&gt;
&lt;li&gt;Set up Docker Buildx, a tool that allows building multi-architecture images.&lt;/li&gt;
&lt;li&gt;Build and push the image to Docker Hub, using the information from the Dockerfile.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Make sure to change the &lt;code&gt;tags&lt;/code&gt; in the above code to the Docker Hub repo that you want to create. In my case, I have kept it as &lt;code&gt;go-server&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This code provides a simple and effective way to automate the process of building and pushing Docker images to a registry like Docker Hub. With this code, you can ensure that your images are always up-to-date and ready to be deployed.&lt;/p&gt;

&lt;p&gt;The Dockerfile I used for this Github Action was a simple Golang HTTP server. You can check out the entire code &lt;a href="https://github.com/dhanushreddy291/docker-build-github-actions" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In conclusion, Github Actions is a powerful tool that allows you to automate many processes in your development workflow, including building and pushing Docker images. By automating the process of building and pushing Docker images, you can save a lot of time and effort, and ensure that your images are always up to date.&lt;/p&gt;

&lt;p&gt;If you still have any questions about this post or want to discuss something with me, feel free to connect on &lt;a href="https://www.linkedin.com/in/dhanushreddy29/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or &lt;a href="https://twitter.com/dhanushreddy291" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you run an organization and want me to write for you, please connect with me on my Socials 🙃&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>devops</category>
      <category>beginners</category>
      <category>tutorial</category>
      <category>docker</category>
    </item>
  </channel>
</rss>
