<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rupam Golui</title>
    <description>The latest articles on DEV Community by Rupam Golui (@agasta).</description>
    <link>https://dev.to/agasta</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2861497%2F2a297716-0591-4407-9250-7ea374c4c585.jpg</url>
      <title>DEV Community: Rupam Golui</title>
      <link>https://dev.to/agasta</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/agasta"/>
    <language>en</language>
    <item>
      <title>5 Steps to Deploying an AWS Lambda with EventBridge using Terraform</title>
      <dc:creator>Rupam Golui</dc:creator>
      <pubDate>Mon, 02 Feb 2026 08:59:01 +0000</pubDate>
      <link>https://dev.to/agasta/5-12-steps-to-deploying-an-aws-lambda-with-eventbridge-using-terraform-5f1c</link>
      <guid>https://dev.to/agasta/5-12-steps-to-deploying-an-aws-lambda-with-eventbridge-using-terraform-5f1c</guid>
      <description>&lt;p&gt;I used to think serverless meant "upload code, magic happens." Then today I tried to deploy a Python Lambda that needed custom dependencies, a schedule trigger, and enough memory to actually stay alive, and quietly questioning my career choices.&lt;/p&gt;

&lt;p&gt;Never mind. Here's how I finally got my &lt;code&gt;atlas-worker&lt;/code&gt; Lambda running: containerized, scheduled, and only slightly over-provisioned. (You can jump to the last section for the complete code.)&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Accept That ZIP Files Are for Cowards (Use Containers)
&lt;/h2&gt;

&lt;p&gt;Look, you &lt;em&gt;could&lt;/em&gt; package your Lambda as a ZIP. You could also commute to work on a unicycle. Both are technically possible, but why suffer?&lt;/p&gt;

&lt;p&gt;I needed &lt;code&gt;psycopg2&lt;/code&gt;, some geospatial libraries, and enough room to breathe; ZIP packages also have a hard size limit (250 MB unzipped, versus 10 GB for container images). So I went with a container image. The first hurdle? Terraform needs to log into ECR &lt;em&gt;before&lt;/em&gt; it can push the image, but it needs to know the registry URL to log in... which requires knowing your account ID.&lt;/p&gt;

&lt;p&gt;That's where the &lt;code&gt;aws_caller_identity&lt;/code&gt; and &lt;code&gt;aws_ecr_authorization_token&lt;/code&gt; data sources come in. It feels like Terraform inception, but it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_caller_identity"&lt;/span&gt; &lt;span class="s2"&gt;"current"&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_ecr_authorization_token"&lt;/span&gt; &lt;span class="s2"&gt;"token"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;registry_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_caller_identity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;account_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"docker"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;registry_auth&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;address&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${data.aws_caller_identity.current.account_id}.dkr.ecr.${var.aws_region}.amazonaws.com"&lt;/span&gt;
    &lt;span class="nx"&gt;username&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_ecr_authorization_token&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user_name&lt;/span&gt;
    &lt;span class="nx"&gt;password&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_ecr_authorization_token&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;password&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This assumes your AWS CLI is already set up with valid credentials.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: Let Terraform Build the Image (Yes, Really)
&lt;/h2&gt;

&lt;p&gt;I used to build the Docker image separately, tag it manually, push it, then update Terraform. That's three steps too many. The &lt;code&gt;terraform-aws-modules/lambda&lt;/code&gt; module has a docker-build submodule that handles this in one go.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"docker_image"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/lambda/aws//modules/docker-build"&lt;/span&gt;

  &lt;span class="nx"&gt;create_ecr_repo&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;ecr_repo&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"atlas-worker"&lt;/span&gt;
  &lt;span class="nx"&gt;use_image_tag&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;image_tag&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;image_tag&lt;/span&gt; &lt;span class="c1"&gt;# Keep a track on it&lt;/span&gt;
  &lt;span class="nx"&gt;source_path&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"../apps/workers"&lt;/span&gt; &lt;span class="c1"&gt;# your docker file location&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; The image is only rebuilt when you change &lt;code&gt;image_tag&lt;/code&gt;, not when you change the code, so keep track of it.&lt;br&gt;
Set &lt;code&gt;create_ecr_repo = true&lt;/code&gt; and Terraform provisions the registry, builds the image, and pushes it. It's terrifyingly convenient. I kept waiting for the catch.&lt;/p&gt;
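&lt;p&gt;One workaround I use (a sketch, not the module's documented canonical approach, so verify it against your layout): derive the tag from a hash of the source files, so any code change forces a rebuild:&lt;/p&gt;

```hcl
# Sketch: tie the image tag to the source contents so a code change
# automatically triggers a rebuild. Path and module config mirror the
# snippet above; the hashing locals are my own addition.
locals {
  worker_src_hash = sha1(join("", [
    for f in fileset("../apps/workers", "**") : filesha1("../apps/workers/${f}")
  ]))
}

module "docker_image" {
  source = "terraform-aws-modules/lambda/aws//modules/docker-build"

  create_ecr_repo = true
  ecr_repo        = "atlas-worker"
  use_image_tag   = true
  image_tag       = local.worker_src_hash # changes whenever the code changes
  source_path     = "../apps/workers"
}
```

The trade-off: tags become opaque hashes instead of human-readable versions, so keep an explicit `var.image_tag` if you want meaningful release tags.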

&lt;p&gt;(There is a catch. We'll get to it in Step 4½.)&lt;/p&gt;


&lt;h2&gt;
  
  
  Step 3: Configure the Lambda
&lt;/h2&gt;

&lt;p&gt;AWS defaults are... optimistic. A 3-second timeout and 128 MB of RAM work for "Hello World." My worker connects to Postgres, queries an API, and uploads to S3. So I maxed out the timeout to 15 minutes and gave it 1 GB of memory.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"lambda_function"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/lambda/aws"&lt;/span&gt;

  &lt;span class="nx"&gt;function_name&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"atlas-worker"&lt;/span&gt;
  &lt;span class="nx"&gt;create_package&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;image_uri&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;docker_image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;image_uri&lt;/span&gt;
  &lt;span class="nx"&gt;package_type&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Image"&lt;/span&gt;

  &lt;span class="nx"&gt;timeout&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_timeout&lt;/span&gt;  &lt;span class="c1"&gt;# 900 seconds = "Please just finish"&lt;/span&gt;
  &lt;span class="nx"&gt;memory_size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_memory&lt;/span&gt;   &lt;span class="c1"&gt;# 1024 MB = "I believe in you"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yes, I'm paying for a Lambo to do grocery runs. But it &lt;em&gt;works&lt;/em&gt;.&lt;/p&gt;
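&lt;p&gt;For reference, the two variables could be declared like this (names match the snippet above; the descriptions and defaults are my assumptions):&lt;/p&gt;

```hcl
# Hypothetical variables.tf entries backing the lambda module config.
variable "lambda_timeout" {
  description = "Function timeout in seconds (900 is the Lambda maximum)"
  type        = number
  default     = 900
}

variable "lambda_memory" {
  description = "Memory in MB; Lambda allocates CPU proportionally to memory"
  type        = number
  default     = 1024
}
```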




&lt;h2&gt;
  
  
  Step 4: EventBridge (Or "CloudWatch Events" for Us Old Timers)
&lt;/h2&gt;

&lt;p&gt;I wanted this thing to run weekly. My brain said "CloudWatch cron," but AWS renamed CloudWatch Events to EventBridge back in 2019 and my muscle memory hasn't caught up.&lt;/p&gt;
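&lt;p&gt;A note on the schedule itself: AWS cron expressions have &lt;em&gt;six&lt;/em&gt; fields, not the five of Unix cron, which tripped me up. Here's how I'd declare the variable (a sketch; the default matches my weekly schedule):&lt;/p&gt;

```hcl
# AWS cron format: cron(minutes hours day-of-month month day-of-week year)
# Note: day-of-month and day-of-week can't both be specific; one must be "?".
variable "schedule_expression" {
  description = "EventBridge schedule: every Sunday at 12:00 UTC"
  type        = string
  default     = "cron(0 12 ? * SUN *)"
}
```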

&lt;p&gt;The EventBridge module connects your Lambda to a schedule:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"eventbridge"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/eventbridge/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"4.2.2"&lt;/span&gt;

  &lt;span class="nx"&gt;create_bus&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;  &lt;span class="c1"&gt;# Don't need a custom event bus&lt;/span&gt;

  &lt;span class="nx"&gt;rules&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;weekly_sync&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;description&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Trigger ARGO float weekly sync"&lt;/span&gt;
      &lt;span class="nx"&gt;schedule_expression&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;schedule_expression&lt;/span&gt;  &lt;span class="c1"&gt;# cron(0 12 ? * SUN *)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;targets&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;weekly_sync&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;name&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"atlas-worker-lambda"&lt;/span&gt;
        &lt;span class="nx"&gt;arn&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_function_arn&lt;/span&gt;
        &lt;span class="nx"&gt;input&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"update"&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;input&lt;/code&gt; field? It sends a custom JSON payload to your Lambda. My function checks &lt;code&gt;event["operation"]&lt;/code&gt; to know whether it's a scheduled run or a manual invocation.&lt;/p&gt;
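&lt;p&gt;On the Lambda side, the handler receives that payload as its event. Mine looks roughly like this (a simplified sketch, not the actual worker code):&lt;/p&gt;

```python
import json


def lambda_handler(event, context):
    # EventBridge delivers the rule's `input` JSON as the event dict,
    # so a scheduled run arrives as {"operation": "update"}.
    operation = event.get("operation", "manual")

    if operation == "update":
        # Scheduled weekly sync path.
        result = {"status": "ok", "mode": "scheduled"}
    else:
        # Anything else is treated as a manual invocation.
        result = {"status": "ok", "mode": "manual"}

    return {"statusCode": 200, "body": json.dumps(result)}
```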




&lt;h2&gt;
  
  
  Step 4½: The Permission That Haunts My Dreams ☠️
&lt;/h2&gt;

&lt;p&gt;Here's the "½ step"—the thing that isn't in the main tutorial but will break everything if you skip it.&lt;/p&gt;

&lt;p&gt;EventBridge can &lt;em&gt;say&lt;/em&gt; it will trigger your Lambda, but unless you explicitly give it permission, AWS will silently drop those invocations. No error logs. No failed invocations metric. Just... nothing happening at 2 AM when your cron fires.&lt;/p&gt;

&lt;p&gt;You need the &lt;code&gt;allowed_triggers&lt;/code&gt; block in your Lambda module:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;allowed_triggers&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;EventBridgeRule&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;principal&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"events.amazonaws.com"&lt;/span&gt;
    &lt;span class="nx"&gt;source_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;eventbridge&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;eventbridge_rule_arns&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"weekly_sync"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I spent a morning thinking my cron expression was wrong. Nope. The Lambda just... rejected the trigger. Politely. Without telling me.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is the ½ step because it's invisible until it isn't.&lt;/strong&gt;&lt;/p&gt;
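&lt;p&gt;Under the hood, that &lt;code&gt;allowed_triggers&lt;/code&gt; block creates a resource-based policy on the function, roughly equivalent to this raw resource (a sketch; the module manages it for you):&lt;/p&gt;

```hcl
# Roughly what allowed_triggers generates: a resource-based policy statement
# letting EventBridge (and only this specific rule) invoke the function.
resource "aws_lambda_permission" "eventbridge" {
  statement_id  = "EventBridgeRule"
  action        = "lambda:InvokeFunction"
  function_name = module.lambda_function.lambda_function_name
  principal     = "events.amazonaws.com"
  source_arn    = module.eventbridge.eventbridge_rule_arns["weekly_sync"]
}
```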




&lt;h2&gt;
  
  
  Step 5: Environment Variables (AKA "How Many Secrets Can I Fit In Here")
&lt;/h2&gt;

&lt;p&gt;Last, the config: database URLs, S3 credentials, the whole messy reality of "my app needs to talk to things":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;environment_variables&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;PG_WRITE_URL&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pg_write_url&lt;/span&gt;
  &lt;span class="nx"&gt;S3_ACCESS_KEY&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;s3_access_key&lt;/span&gt;
  &lt;span class="nx"&gt;S3_SECRET_KEY&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;s3_secret_key&lt;/span&gt;
  &lt;span class="nx"&gt;S3_ENDPOINT&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;s3_endpoint&lt;/span&gt;
  &lt;span class="nx"&gt;S3_BUCKET_NAME&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;s3_bucket_name&lt;/span&gt;
  &lt;span class="nx"&gt;ARGO_DAC&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;argo_dac&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Are these in AWS Secrets Manager? No. Should they be? Probably. But this is a dev blog, not a security audit. We'll pretend I used Terraform Cloud workspaces with encrypted variables and move on.&lt;/p&gt;
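&lt;p&gt;If you do want Secrets Manager, the shape looks roughly like this (a sketch; the secret name and JSON keys are hypothetical):&lt;/p&gt;

```hcl
# Hypothetical sketch: read one JSON secret instead of passing raw variables.
data "aws_secretsmanager_secret_version" "worker" {
  secret_id = "atlas-worker/prod" # hypothetical secret name
}

locals {
  worker_secrets = jsondecode(data.aws_secretsmanager_secret_version.worker.secret_string)
}

# Then, inside the lambda module:
# environment_variables = {
#   PG_WRITE_URL = local.worker_secrets["pg_write_url"]
# }
```

Caveat: the values still end up in Terraform state and the function's plaintext environment variables; for stronger isolation, fetch secrets at runtime with the AWS SDK instead.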




&lt;p&gt;I assume you've already set up your &lt;code&gt;main.tf&lt;/code&gt;, so here's the full &lt;code&gt;worker.tf&lt;/code&gt;. Just set the env vars, and thank me later :)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_caller_identity"&lt;/span&gt; &lt;span class="s2"&gt;"current"&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_ecr_authorization_token"&lt;/span&gt; &lt;span class="s2"&gt;"token"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;registry_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_caller_identity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;account_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"docker"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;registry_auth&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;address&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${data.aws_caller_identity.current.account_id}.dkr.ecr.${var.aws_region}.amazonaws.com"&lt;/span&gt;
    &lt;span class="nx"&gt;username&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_ecr_authorization_token&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user_name&lt;/span&gt;
    &lt;span class="nx"&gt;password&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_ecr_authorization_token&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;password&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"lambda_function"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/lambda/aws"&lt;/span&gt;

  &lt;span class="nx"&gt;function_name&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"atlas-worker"&lt;/span&gt;
  &lt;span class="nx"&gt;create_package&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

  &lt;span class="nx"&gt;image_uri&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;docker_image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;image_uri&lt;/span&gt;
  &lt;span class="nx"&gt;package_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Image"&lt;/span&gt;

  &lt;span class="nx"&gt;timeout&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_timeout&lt;/span&gt; &lt;span class="c1"&gt;# 15 minutes&lt;/span&gt;
  &lt;span class="nx"&gt;memory_size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_memory&lt;/span&gt;  &lt;span class="c1"&gt;# 1GB&lt;/span&gt;

  &lt;span class="nx"&gt;environment_variables&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;PG_WRITE_URL&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pg_write_url&lt;/span&gt;
    &lt;span class="nx"&gt;S3_ACCESS_KEY&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;s3_access_key&lt;/span&gt;
    &lt;span class="nx"&gt;S3_SECRET_KEY&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;s3_secret_key&lt;/span&gt;
    &lt;span class="nx"&gt;S3_ENDPOINT&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;s3_endpoint&lt;/span&gt;
    &lt;span class="nx"&gt;S3_BUCKET_NAME&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;s3_bucket_name&lt;/span&gt;
    &lt;span class="nx"&gt;ARGO_DAC&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;argo_dac&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;# Allow EventBridge to invoke this Lambda&lt;/span&gt;
  &lt;span class="nx"&gt;create_current_version_allowed_triggers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;allowed_triggers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;EventBridgeRule&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;principal&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"events.amazonaws.com"&lt;/span&gt;
      &lt;span class="nx"&gt;source_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;eventbridge&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;eventbridge_rule_arns&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"weekly_sync"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"docker_image"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/lambda/aws//modules/docker-build"&lt;/span&gt;

  &lt;span class="nx"&gt;create_ecr_repo&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;ecr_repo&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"atlas-worker"&lt;/span&gt;

  &lt;span class="nx"&gt;use_image_tag&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;image_tag&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;image_tag&lt;/span&gt;

  &lt;span class="nx"&gt;source_path&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"../apps/workers"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"eventbridge"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/eventbridge/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"4.2.2"&lt;/span&gt;

  &lt;span class="nx"&gt;create_bus&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

  &lt;span class="nx"&gt;rules&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;weekly_sync&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;description&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Trigger ARGO float weekly sync"&lt;/span&gt;
      &lt;span class="nx"&gt;schedule_expression&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;schedule_expression&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;# Connect Lambda as target&lt;/span&gt;
  &lt;span class="nx"&gt;targets&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;weekly_sync&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;name&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"atlas-worker-lambda"&lt;/span&gt;
        &lt;span class="nx"&gt;arn&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_function_arn&lt;/span&gt;
        &lt;span class="nx"&gt;input&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;operation&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"update"&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"atlas-worker-scheduler"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can check out my Terraform repo structure &lt;a href="https://github.com/Itz-Agasta/Atlas/tree/main/terraform" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>terraform</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Running Goose in Containers (Without Losing Your Mind)</title>
      <dc:creator>Rupam Golui</dc:creator>
      <pubDate>Sat, 04 Oct 2025 06:29:56 +0000</pubDate>
      <link>https://dev.to/agasta/running-goose-in-containers-without-losing-your-mind-3m8</link>
      <guid>https://dev.to/agasta/running-goose-in-containers-without-losing-your-mind-3m8</guid>
      <description>&lt;p&gt;I'm a huge fan of containers. They're not just a cool buzzword for résumés; they actually save your sanity. And today, I'll show you how to run goose inside Docker. Specifically how to integrate it into CI/CD pipelines, debug containerized workflows, and manage secure deployments at scale. Once you containerize goose, you'll never go back to raw installs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Containerize Goose? The Real Benefits
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://block.github.io/goose/" rel="noopener noreferrer"&gt;Goose&lt;/a&gt; is an AI agent that can automate engineering tasks, build projects from scratch, debug code, and orchestrate complex workflows through the &lt;a href="https://modelcontextprotocol.io/docs/getting-started/intro" rel="noopener noreferrer"&gt;Model Context Protocol&lt;/a&gt; (MCP). But here's why containers unlock its true potential:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For CI/CD Integration:&lt;/strong&gt; Run automated code reviews, documentation generation, and testing across your entire pipeline without environment drift. Imagine having goose automatically review every PR, generate release notes, or validate your infrastructure as code, all running consistently in containers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For Team Collaboration:&lt;/strong&gt; Every developer gets the same goose setup, eliminating "works on my machine" problems when sharing AI-powered workflows or debugging sessions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For Production Deployments:&lt;/strong&gt; Scale goose instances horizontally, manage API keys securely, and deploy across multiple environments with confidence.&lt;/p&gt;

&lt;p&gt;In this guide, we'll cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Quick deployment with pre-built images&lt;/li&gt;
&lt;li&gt;CI/CD pipeline integration with GitHub Actions and GitLab&lt;/li&gt;
&lt;li&gt;Debugging containerized goose workflows&lt;/li&gt;
&lt;li&gt;Production-ready security and scaling patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The benefits are immediate and compound over time, especially when you're managing API keys for multiple LLM providers and need consistent behavior across environments.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Start: Your First Containerized Workflow
&lt;/h2&gt;

&lt;p&gt;Let's jump straight into a practical example: using goose to analyze and improve a codebase:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pull the image and analyze your project&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:/workspace &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-w&lt;/span&gt; /workspace &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;GOOSE_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;openai &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;GOOSE_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;gpt-4o &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$OPENAI_API_KEY&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   ghcr.io/block/goose:v0.9.3 run &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="s2"&gt;"Review this code for security issues and suggest improvements"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. The ~340MB image contains everything goose needs to analyze your code, suggest improvements, or even refactor entire functions. This same pattern works for documentation generation, test creation, or architectural reviews.&lt;/p&gt;

&lt;h3&gt;
  
  
  Building Your Own Images
&lt;/h3&gt;

&lt;p&gt;Sometimes you need customizations. Maybe you want the bleeding edge from source, or you need additional tools. Building from source is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone and build&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;git clone https://github.com/block/goose.git
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;goose
&lt;span class="nv"&gt;$ &lt;/span&gt;docker build &lt;span class="nt"&gt;-t&lt;/span&gt; goose:local &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The build uses a multi-stage Dockerfile: it compiles with Rust's heavy toolchain, then copies just the binary into a minimal Debian runtime. The Dockerfile even includes Link-Time Optimization (LTO) and binary stripping to keep things lean.&lt;/p&gt;

&lt;p&gt;Pro tip: For development builds with debug symbols, add &lt;code&gt;--build-arg CARGO_PROFILE_RELEASE_STRIP=false&lt;/code&gt;.&lt;/p&gt;
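&lt;p&gt;To make the pattern concrete, here is a simplified sketch of such a multi-stage build. This is not the project's actual Dockerfile; the base images, paths, and stage names are illustrative:&lt;/p&gt;

```dockerfile
# Build stage: full Rust toolchain (tag is illustrative)
FROM rust:bookworm AS builder
WORKDIR /src
COPY . .
# In the real Dockerfile, LTO and stripping are switched on via Cargo profile settings
RUN cargo build --release

# Runtime stage: minimal Debian image with just the compiled binary
FROM debian:bookworm-slim
COPY --from=builder /src/target/release/goose /usr/local/bin/goose
USER 1000
ENTRYPOINT ["goose"]
```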

&lt;h2&gt;
  
  
  Running Goose Effectively
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Basic CLI Usage
&lt;/h3&gt;

&lt;p&gt;Mount your workspace and let goose work its magic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:/workspace &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-w&lt;/span&gt; /workspace &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;GOOSE_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;openai &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;GOOSE_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;gpt-4o &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$OPENAI_API_KEY&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   goose:local run &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="s2"&gt;"Analyze this codebase"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Interactive Sessions
&lt;/h3&gt;

&lt;p&gt;For longer sessions, use the interactive mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;docker run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:/workspace &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-w&lt;/span&gt; /workspace &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;GOOSE_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;anthropic &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;GOOSE_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;claude-3-5-sonnet-20241022 &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$ANTHROPIC_API_KEY&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   goose:local session
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Docker Compose for Complex Setups
&lt;/h3&gt;

&lt;p&gt;When you're dealing with multiple services or persistent config, use Docker Compose:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3.8"&lt;/span&gt;
&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;goose&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ghcr.io/block/goose:latest&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;GOOSE_PROVIDER=${GOOSE_PROVIDER:-openai}&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;GOOSE_MODEL=${GOOSE_MODEL:-gpt-4o}&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;OPENAI_API_KEY=${OPENAI_API_KEY}&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./workspace:/workspace&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;goose-config:/home/goose/.config/goose&lt;/span&gt;
    &lt;span class="na"&gt;working_dir&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/workspace&lt;/span&gt;
    &lt;span class="na"&gt;stdin_open&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;tty&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;goose-config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run with: &lt;code&gt;$ docker-compose run --rm goose session&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Configuration Deep Dive
&lt;/h2&gt;

&lt;p&gt;Goose supports all the usual environment variables: &lt;code&gt;GOOSE_PROVIDER&lt;/code&gt;, &lt;code&gt;GOOSE_MODEL&lt;/code&gt;, and provider-specific keys. The container runs as a non-root user (UID 1000) by default, which is great for security.&lt;/p&gt;

&lt;p&gt;For persistent config, mount the config directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-v&lt;/span&gt; ~/.config/goose:/home/goose/.config/goose &lt;span class="se"&gt;\&lt;/span&gt;
    goose:local configure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Need extra tools? The image is based on Debian Bookworm Slim, so you can install what you need:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;FROM ghcr.io/block/goose:latest

USER root

RUN apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; vim tmux &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/apt/lists/&lt;span class="k"&gt;*&lt;/span&gt;

USER goose
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  CI/CD Integration: Where Containers Shine
&lt;/h2&gt;

&lt;p&gt;This is where containerization really pays off. In CI/CD pipelines, you want consistent, isolated environments. Here's how to integrate goose:&lt;/p&gt;

&lt;h3&gt;
  
  
  GitHub Actions
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;analyze&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;container&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ghcr.io/block/goose:latest&lt;/span&gt;
      &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;GOOSE_PROVIDER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openai&lt;/span&gt;
        &lt;span class="na"&gt;GOOSE_MODEL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gpt-4o&lt;/span&gt;
        &lt;span class="na"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.OPENAI_API_KEY }}&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run goose analysis&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;goose run -t "Review this codebase for security issues"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  GitLab CI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;analyze&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ghcr.io/block/goose:latest&lt;/span&gt;
  &lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;GOOSE_PROVIDER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openai&lt;/span&gt;
    &lt;span class="na"&gt;GOOSE_MODEL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gpt-4o&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;goose run -t "Generate documentation for this project"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I've used this setup on multiple projects, and it's a game-changer for automated code reviews and documentation generation.&lt;/p&gt;
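&lt;p&gt;If several pipelines share the same invocation, a small wrapper keeps the jobs readable. The sketch below is an assumption-heavy convenience, not part of goose: &lt;code&gt;goose_review&lt;/code&gt;, &lt;code&gt;GOOSE_IMAGE&lt;/code&gt;, and the &lt;code&gt;DRY_RUN&lt;/code&gt; switch are all hypothetical names:&lt;/p&gt;

```shell
# Hypothetical CI helper: assemble the docker invocation once, reuse it in every job.
goose_review() {
  prompt="$1"
  image="${GOOSE_IMAGE:-ghcr.io/block/goose:v0.9.3}"
  # -e VAR with no value passes the variable through from the CI environment
  cmd="docker run --rm -v $(pwd):/workspace -w /workspace -e GOOSE_PROVIDER -e GOOSE_MODEL -e OPENAI_API_KEY $image run -t \"$prompt\""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the assembled command instead of running it, to sanity-check the pipeline
    echo "$cmd"
  else
    eval "$cmd"
  fi
}
```

&lt;p&gt;A job then calls &lt;code&gt;goose_review "Review this codebase for security issues"&lt;/code&gt;; setting &lt;code&gt;DRY_RUN=1&lt;/code&gt; prints the command so you can inspect it locally first.&lt;/p&gt;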

&lt;h2&gt;
  
  
  Troubleshooting: Real-World Issues
&lt;/h2&gt;

&lt;h3&gt;
  
  
  File Permission Problems with Workspace Mounts
&lt;/h3&gt;

&lt;p&gt;When goose can't write to your mounted workspace, it's usually a user ID mismatch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Problem: "Permission denied" when goose tries to create files&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:/workspace ghcr.io/block/goose:v0.9.3

&lt;span class="c"&gt;# Solution: Match container user with host user&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:/workspace &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-u&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   ghcr.io/block/goose:v0.9.3 run &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="s2"&gt;"Create a README.md"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Managing Multiple API Keys
&lt;/h3&gt;

&lt;p&gt;When working with multiple LLM providers (OpenAI, Anthropic, Google), passing individual &lt;code&gt;-e&lt;/code&gt; flags becomes unwieldy and error-prone. Instead, use a single &lt;code&gt;.env&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a comprehensive .env file&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; .env &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
GOOSE_PROVIDER=openai
GOOSE_MODEL=gpt-4o
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
GOOGLE_API_KEY=your-google-key
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;&lt;span class="c"&gt;# Use the entire file at once&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;--env-file&lt;/span&gt; .env &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:/workspace goose:v0.9.3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach is cleaner, more maintainable, and essential for CI/CD environments where you might switch between providers based on cost, availability, or model capabilities. It also keeps sensitive keys out of your shell history; just make sure &lt;code&gt;.env&lt;/code&gt; is in your &lt;code&gt;.gitignore&lt;/code&gt; so the keys never land in version control.&lt;/p&gt;

&lt;h3&gt;
  
  
  Connecting to Local Development Services
&lt;/h3&gt;

&lt;p&gt;When goose needs to interact with local databases, development servers, or APIs running on your host machine, the container's isolated network becomes a barrier.&lt;/p&gt;

&lt;p&gt;Use host networking to bridge this gap:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Allow goose to access your local services&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;--network&lt;/span&gt; host &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;--env-file&lt;/span&gt; .env &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:/workspace &lt;span class="se"&gt;\&lt;/span&gt;
   ghcr.io/block/goose:v0.9.3 run &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="s2"&gt;"Test the API endpoints in our local development server"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Security note:&lt;/strong&gt; Only use host networking in development environments. For production, use proper service discovery and networking configurations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Advanced Patterns
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Resource Limits
&lt;/h3&gt;

&lt;p&gt;Set memory and CPU limits for production:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;--memory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"2g"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;--cpus&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"2"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   goose:local
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This prevents individual containers from consuming excessive resources, keeping performance and costs predictable. From there you can scale horizontally (more container instances) or vertically (more resources per container) as your workload demands.&lt;/p&gt;
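&lt;p&gt;The same limits carry over to Docker Compose. A sketch, assuming the non-Swarm Compose form (&lt;code&gt;mem_limit&lt;/code&gt;/&lt;code&gt;cpus&lt;/code&gt;); the values are illustrative:&lt;/p&gt;

```yaml
services:
  goose:
    image: ghcr.io/block/goose:v0.9.3
    mem_limit: 2g   # mirrors --memory="2g"
    cpus: 2         # mirrors --cpus="2"
```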

&lt;h3&gt;
  
  
  Debugging Container Issues
&lt;/h3&gt;

&lt;p&gt;When goose behaves unexpectedly in containers, you need to investigate the environment. Here are common debugging scenarios:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inspect the container environment:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Drop into a shell to examine the container&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--entrypoint&lt;/span&gt; bash &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:/workspace &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;--env-file&lt;/span&gt; .env &lt;span class="se"&gt;\&lt;/span&gt;
   ghcr.io/block/goose:v0.9.3

&lt;span class="c"&gt;# Inside the container, check:&lt;/span&gt;
goose@container:~&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;env&lt;/span&gt; | &lt;span class="nb"&gt;grep &lt;/span&gt;GOOSE      &lt;span class="c"&gt;# Environment variables&lt;/span&gt;
goose@container:~&lt;span class="nv"&gt;$ &lt;/span&gt;goose &lt;span class="nt"&gt;--version&lt;/span&gt;       &lt;span class="c"&gt;# Binary version&lt;/span&gt;
goose@container:~&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt; /workspace     &lt;span class="c"&gt;# File permissions&lt;/span&gt;
goose@container:~&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; ~/.config/goose/config.yaml  &lt;span class="c"&gt;# Configuration&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Debug API connectivity issues:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Test network connectivity and API access&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--entrypoint&lt;/span&gt; bash goose:v1.8.0
goose@container:~&lt;span class="nv"&gt;$ &lt;/span&gt;curl &lt;span class="nt"&gt;-I&lt;/span&gt; https://api.openai.com/v1/models
&lt;span class="c"&gt;# Verify API keys&lt;/span&gt;
goose@container:~&lt;span class="nv"&gt;$ &lt;/span&gt;goose configure &lt;span class="nt"&gt;--check&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Examine goose logs in verbose mode:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;:/workspace &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;--env-file&lt;/span&gt; .env &lt;span class="se"&gt;\&lt;/span&gt;
   ghcr.io/block/goose:v0.9.3 &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;--verbose&lt;/span&gt; run &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="s2"&gt;"Your failing command"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This debugging approach helps identify issues with file permissions, network connectivity, API authentication, or configuration problems that might not be obvious from normal goose output.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multi-Platform Builds
&lt;/h3&gt;

&lt;p&gt;For deployment across architectures:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;docker buildx build &lt;span class="nt"&gt;--platform&lt;/span&gt; linux/amd64,linux/arm64 &lt;span class="nt"&gt;-t&lt;/span&gt; goose:multi &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Production Considerations
&lt;/h2&gt;

&lt;p&gt;For production deployments, replace the &lt;code&gt;latest&lt;/code&gt; tags in our examples with pinned release versions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Instead of this (unpredictable):&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;docker pull ghcr.io/block/goose:latest

&lt;span class="c"&gt;# Use this (reproducible):&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;docker pull ghcr.io/block/goose:v1.8.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pinning to specific versions prevents surprises during deployments and makes rollbacks predictable when issues arise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrap-Up
&lt;/h2&gt;

&lt;p&gt;Containerizing goose has been one of those "why didn't I do this sooner?" moments in my career. It eliminates environment drift, simplifies deployments, and makes your AI-powered workflows truly portable. So, start with the prebuilt image, then customize as you grow. Your future self will thank you.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>docker</category>
      <category>vibecoding</category>
      <category>goose</category>
    </item>
    <item>
      <title>Why You Should Use uv Inside Jupyter Notebooks</title>
      <dc:creator>Rupam Golui</dc:creator>
      <pubDate>Mon, 18 Aug 2025 13:50:53 +0000</pubDate>
      <link>https://dev.to/agasta/-2llk</link>
      <guid>https://dev.to/agasta/-2llk</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/agasta" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2861497%2F2a297716-0591-4407-9250-7ea374c4c585.jpg" alt="agasta"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/agasta/why-you-should-use-uv-inside-jupyter-notebooks-h03" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Why You Should Use uv Inside Jupyter Notebooks&lt;/h2&gt;
      &lt;h3&gt;Rupam Golui ・ Aug 18&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#python&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#jupyter&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#datascience&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#productivity&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>python</category>
      <category>jupyter</category>
      <category>datascience</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Why You Should Use uv Inside Jupyter Notebooks</title>
      <dc:creator>Rupam Golui</dc:creator>
      <pubDate>Mon, 18 Aug 2025 13:50:20 +0000</pubDate>
      <link>https://dev.to/agasta/why-you-should-use-uv-inside-jupyter-notebooks-h03</link>
      <guid>https://dev.to/agasta/why-you-should-use-uv-inside-jupyter-notebooks-h03</guid>
      <description>&lt;p&gt;If you’ve ever worked with &lt;strong&gt;Jupyter Notebooks&lt;/strong&gt;, you know the pain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One notebook runs on Python 3.10, another wants 3.11.&lt;/li&gt;
&lt;li&gt;Some depend on &lt;code&gt;torch==2.1.0&lt;/code&gt;, others break unless it’s &lt;code&gt;2.0.1&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;And don’t even get me started on dependency conflicts when you’re switching between projects.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Traditional tools like &lt;code&gt;pip&lt;/code&gt; + &lt;code&gt;venv&lt;/code&gt; work fine, but they can feel &lt;strong&gt;slow and clunky&lt;/strong&gt;. Enter &lt;strong&gt;&lt;a href="https://github.com/astral-sh/uv" rel="noopener noreferrer"&gt;&lt;code&gt;uv&lt;/code&gt;&lt;/a&gt;&lt;/strong&gt;, a blazing-fast Python package manager + environment manager.&lt;/p&gt;

&lt;p&gt;Think of &lt;code&gt;uv&lt;/code&gt; as &lt;strong&gt;the next-gen replacement for pip/venv/poetry&lt;/strong&gt;. It makes project setup &lt;strong&gt;faster&lt;/strong&gt;, &lt;strong&gt;cleaner&lt;/strong&gt;, and way more consistent. And yes, you can use it seamlessly with Jupyter Notebooks.&lt;/p&gt;

&lt;p&gt;Here’s how to set it up 👇&lt;/p&gt;




&lt;h2&gt;
  
  
  Setup Instructions
&lt;/h2&gt;

&lt;p&gt;We’ll register a &lt;strong&gt;new Jupyter kernel&lt;/strong&gt; that uses a &lt;code&gt;uv&lt;/code&gt;-managed environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Install project dependencies
&lt;/h3&gt;

&lt;p&gt;Inside your project folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;span class="c"&gt;# or&lt;/span&gt;
uv &lt;span class="nb"&gt;sync&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;💡 Requires &lt;code&gt;uv&lt;/code&gt; installed globally. If you don’t have it yet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-LsSf&lt;/span&gt; https://astral.sh/uv/install.sh | sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or follow the official &lt;a href="https://docs.astral.sh/uv/getting-started/installation/" rel="noopener noreferrer"&gt;install guide&lt;/a&gt;. &lt;/p&gt;




&lt;h3&gt;
  
  
  2. Register a Jupyter kernel for your project
&lt;/h3&gt;

&lt;p&gt;For example, let’s say we’re working on &lt;strong&gt;Virtus&lt;/strong&gt; (&lt;a href="https://github.com/Itz-Agasta/Lopt" rel="noopener noreferrer"&gt;my deepfake detection model&lt;/a&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run python &lt;span class="nt"&gt;-m&lt;/span&gt; ipykernel &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--user&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt; vitrus &lt;span class="nt"&gt;--display-name&lt;/span&gt; &lt;span class="s2"&gt;"Python (vitrus)"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a new Jupyter kernel tied directly to your &lt;code&gt;uv&lt;/code&gt; environment.&lt;/p&gt;




&lt;h3&gt;
  
  
  3. Add the new kernel to Jupyter
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;In &lt;strong&gt;PyCharm&lt;/strong&gt;: just open the &lt;code&gt;.ipynb&lt;/code&gt; file and select &lt;strong&gt;Python (vitrus)&lt;/strong&gt; from the top-right kernel selector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fez2dm0bpngzgmsy0xrhd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fez2dm0bpngzgmsy0xrhd.png" alt="Screenshot of PyCharm Jupyter Notebook kernel selector with " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In &lt;strong&gt;Jupyter Lab / Notebook&lt;/strong&gt;: switch kernels from the dropdown menu to use your shiny new environment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdf88paoiph14nmte44th.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdf88paoiph14nmte44th.png" alt="Screenshot of JupyterLab interface showing the kernel selection dropdown, with " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  4. Remove old kernels
&lt;/h3&gt;

&lt;p&gt;After your work is done, clean up the leftover kernels:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;jupyter kernelspec list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output will look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Available kernels:
  python3           /home/you/.local/share/jupyter/kernels/python3
  lopt              /home/you/.local/share/jupyter/kernels/lopt
  vitrus            /home/you/.local/share/jupyter/kernels/vitrus
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To remove one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;jupyter kernelspec uninstall vitrus
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Why bother with &lt;code&gt;uv&lt;/code&gt;?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Speed&lt;/strong&gt;: installs packages ridiculously fast (thanks to Rust).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Isolation&lt;/strong&gt;: each project gets a clean, dedicated env.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reproducibility&lt;/strong&gt;: &lt;code&gt;uv.lock&lt;/code&gt; means no “works on my machine” nonsense.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Jupyter-friendly&lt;/strong&gt;: easy kernel registration, no hacky workarounds.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 My recommendation: set up your Jupyter projects with &lt;code&gt;uv&lt;/code&gt;, and you’ll avoid 90% of Python environment headaches. Do it once per project and you’re golden.&lt;/p&gt;

</description>
      <category>python</category>
      <category>jupyter</category>
      <category>datascience</category>
      <category>productivity</category>
    </item>
    <item>
      <title>How to Properly Clean Up Docker (and Save Your Sanity)</title>
      <dc:creator>Rupam Golui</dc:creator>
      <pubDate>Sat, 16 Aug 2025 08:14:14 +0000</pubDate>
      <link>https://dev.to/agasta/how-to-properly-clean-up-docker-and-save-your-sanity-4662</link>
      <guid>https://dev.to/agasta/how-to-properly-clean-up-docker-and-save-your-sanity-4662</guid>
      <description>&lt;p&gt;If you’ve been messing around with Docker for a while, chances are your system has turned into a junkyard of old containers, images, and volumes you don’t even remember creating. Docker is amazing, but it’s also a &lt;strong&gt;hoarder by default&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And let’s be honest. if you’re using &lt;strong&gt;Docker Desktop&lt;/strong&gt;, especially on Mac or Windows, that bloated piece of software is probably eating your RAM for breakfast. My honest advice: &lt;strong&gt;uninstall Docker Desktop&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;On &lt;strong&gt;Mac/Linux&lt;/strong&gt; and you still want a GUI? → Check out &lt;a href="https://orbstack.dev" rel="noopener noreferrer"&gt;OrbStack&lt;/a&gt;. Way lighter, way faster.&lt;/li&gt;
&lt;li&gt;Or, better yet: &lt;strong&gt;learn the CLI like a seasoned dev&lt;/strong&gt;. Once you get comfortable, it feels way cleaner and faster.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now, let’s talk about how to &lt;strong&gt;completely clean up Docker&lt;/strong&gt;.&lt;br&gt;
👉 I recommend doing this &lt;strong&gt;weekly&lt;/strong&gt; if you’re a heavy Docker user, or at least &lt;strong&gt;twice a month&lt;/strong&gt; to keep things fresh.&lt;/p&gt;


&lt;h2&gt;
  
  
  🐳 Docker Cleanup (Purge Everything)
&lt;/h2&gt;

&lt;p&gt;This will clear out &lt;strong&gt;unused containers, images, networks, volumes&lt;/strong&gt;, and if you want, completely nuke Docker’s system files.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Soft Cleanup (recommended)
&lt;/h3&gt;

&lt;p&gt;The easiest, safest way to clear unused junk:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker system prune &lt;span class="nt"&gt;-a&lt;/span&gt; &lt;span class="nt"&gt;--volumes&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-a&lt;/code&gt;: removes &lt;strong&gt;all unused images&lt;/strong&gt; (not just dangling ones).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--volumes&lt;/code&gt;: removes &lt;strong&gt;unused volumes&lt;/strong&gt; too.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of this as Docker’s version of spring cleaning.&lt;br&gt;
&lt;em&gt;(Run this once a week, and Docker won’t ever balloon out of control.)&lt;/em&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  2. Stop and Remove All Containers
&lt;/h3&gt;

&lt;p&gt;Sometimes you just want a &lt;strong&gt;fresh start&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker stop &lt;span class="si"&gt;$(&lt;/span&gt;docker ps &lt;span class="nt"&gt;-aq&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
docker &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;docker ps &lt;span class="nt"&gt;-aq&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  3. Remove All Images, Volumes, and Networks
&lt;/h3&gt;

&lt;p&gt;Go full “factory reset” mode on your Docker resources:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker rmi &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;docker images &lt;span class="nt"&gt;-aq&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
docker volume &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;docker volume &lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-q&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
docker network &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;docker network &lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-q&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  4. (Optional) Completely Reset Docker
&lt;/h3&gt;

&lt;p&gt;⚠️ Warning (Linux-only commands): this wipes Docker’s entire state, including all images, containers, volumes, and cached layers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl stop docker
&lt;span class="nb"&gt;sudo rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/docker /var/lib/containerd
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start docker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keeps things lean and mean.&lt;/p&gt;




&lt;h3&gt;
  
  
  TL;DR
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Ditch Docker Desktop → Use OrbStack (Mac) or CLI (Linux).&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;docker system prune -a --volumes&lt;/code&gt; &lt;strong&gt;once a week&lt;/strong&gt; (or at least twice a month).&lt;/li&gt;
&lt;li&gt;Reset Docker fully only if things are really broken.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your future self (and your SSD) will thank you.&lt;/p&gt;

</description>
      <category>docker</category>
      <category>linux</category>
      <category>devops</category>
      <category>productivity</category>
    </item>
    <item>
      <title>How I Fine-Tuned a Vision Transformer to Spot Deepfakes</title>
      <dc:creator>Rupam Golui</dc:creator>
      <pubDate>Mon, 12 May 2025 14:12:38 +0000</pubDate>
      <link>https://dev.to/agasta/building-virtus-how-i-fine-tuned-a-vision-transformer-to-spot-deepfakes-3fo9</link>
      <guid>https://dev.to/agasta/building-virtus-how-i-fine-tuned-a-vision-transformer-to-spot-deepfakes-3fo9</guid>
      <description>&lt;p&gt;This project started out as a hackathon idea — we wanted to create a practical tool that could detect deepfakes in images with high confidence. The goal? Build a complete multi-model deepfake classification framework that doesn’t just sound cool, but works.&lt;/p&gt;

&lt;p&gt;We named the image model &lt;strong&gt;Virtus&lt;/strong&gt; &lt;em&gt;(because hey, if you're fighting fakes, might as well sound noble)&lt;/em&gt;. I handled the image classification side &amp;amp; DevOps, while others tackled video detection and the frontend/backend.&lt;/p&gt;

&lt;p&gt;This post dives into how I built and trained Virtus: the thinking behind the model choice, dataset, training strategies, evaluation, and pushing it to Hugging Face. I’ll also sprinkle in some tips and lessons I picked up along the way — stuff I wish I knew before starting.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Want to skip the reading and jump straight into the code? Here's the full training notebook on&lt;/em&gt; &lt;a href="https://github.com/Itz-Agasta/Lopt/blob/main/models/image/virtus.ipynb" rel="noopener noreferrer"&gt;&lt;strong&gt;GitHub.&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Choosing a Base Model: Why Vision Transformers?
&lt;/h2&gt;

&lt;p&gt;Initially, I considered the usual CNN suspects — &lt;a href="https://en.wikipedia.org/wiki/Residual_neural_network" rel="noopener noreferrer"&gt;ResNet&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/EfficientNet" rel="noopener noreferrer"&gt;EfficientNet&lt;/a&gt;, all the classics. But deepfakes are tricky. The difference between a real and fake face can be insanely subtle — we're talking fine textures, light inconsistencies, stuff that might get blurred out or overlooked by CNNs.&lt;/p&gt;

&lt;p&gt;So I started digging into &lt;a href="https://en.wikipedia.org/wiki/Vision_transformer" rel="noopener noreferrer"&gt;Vision Transformers (ViTs)&lt;/a&gt; — and let’s just say, I went down the rabbit hole.&lt;/p&gt;

&lt;p&gt;Turns out, ViTs aren't just trendy — they're &lt;strong&gt;built different&lt;/strong&gt;. While CNNs work with pixel grids and sliding filters, ViTs treat images like sequences, kind of like sentences. They split an image into patches (aka "visual tokens") and feed them into a transformer — the same architecture that powers modern NLP models like BERT and GPT.&lt;/p&gt;
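&lt;p&gt;To make the “patches as tokens” idea concrete, here’s a quick back-of-the-envelope sketch (mine, not from the model docs): a 224×224 input cut into 16×16 patches becomes 196 visual tokens.&lt;/p&gt;

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping patches a ViT slices an image into."""
    per_side = image_size // patch_size
    return per_side * per_side

# 224x224 input, 16x16 patches -> 14 * 14 = 196 patch tokens.
# (Distilled DeiT also prepends a class token and a distillation token.)
print(num_patches(224, 16))
```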

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyhqpavigvzz1je4nj5bo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyhqpavigvzz1je4nj5bo.png" alt="CNN vs. ViT: FLOPs and throughput comparison of CNN and Vision Transformer Models" width="800" height="313"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;CNN vs. ViT: FLOPs and throughput comparison of CNN and Vision Transformer Models  – &lt;a href="https://arxiv.org/pdf/2012.12556.pdf" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;ViTs actually have &lt;strong&gt;weaker inductive bias&lt;/strong&gt; compared to CNNs — which sounds bad, but it means they don’t assume as much about the structure of images. With enough data (or strong augmentations), they learn better generalizations. And here’s the kicker: they can &lt;strong&gt;outperform CNNs with 4x fewer computational resources&lt;/strong&gt;, which was a huge win for my Kaggle GPU budget.&lt;/p&gt;

&lt;p&gt;If you want a more in-depth comparison, check out &lt;a href="https://viso.ai/deep-learning/vision-transformer-vit/" rel="noopener noreferrer"&gt;this awesome article&lt;/a&gt; — seriously, worth a read.&lt;/p&gt;

&lt;p&gt;Eventually, I ended up choosing &lt;code&gt;facebook/deit-base-distilled-patch16-224&lt;/code&gt;, a ViT that’s been distilled from a CNN teacher model. It’s lightweight (only 87M parameters), fast to train, and surprisingly accurate — even outperforming the standard ViT-Base on ImageNet-1k. Plus, it doesn’t need massive compute or crazy pretraining to get good results, which made it perfect for our hackathon timeline.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0i2cxrzxdbqyjezushvw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0i2cxrzxdbqyjezushvw.png" alt="Vision Transformer ViT Architecture" width="800" height="417"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Vision Transformer ViT Architecture - &lt;a href="https://github.com/google-research/vision_transformer" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If you’ve got more GPU headroom or a larger dataset, there are beefier models out there like &lt;code&gt;google/vit-large-patch16-224-in21k&lt;/code&gt;, &lt;code&gt;vit-base-patch32&lt;/code&gt;, or even the &lt;code&gt;384px version of DeiT&lt;/code&gt; — but for this project, I wanted something fast, efficient, and reliable. DeiT hit that sweet spot.&lt;/p&gt;


&lt;h2&gt;
  
  
  Data Preparation: Deepfake Data Is Messy
&lt;/h2&gt;

&lt;p&gt;I started with a &lt;a href="https://www.kaggle.com/datasets/manjilkarki/deepfake-and-real-images" rel="noopener noreferrer"&gt;Kaggle dataset&lt;/a&gt; that had around 190,000 labeled images of real and fake faces — a solid foundation to begin with. On top of that, I manually added a bunch of extra samples I’d collected from other sources to make things a bit more diverse. Everything was organized into &lt;strong&gt;Real/&lt;/strong&gt; and &lt;strong&gt;Fake/&lt;/strong&gt; folders, so loading them with &lt;code&gt;Path.glob&lt;/code&gt; was smooth sailing.&lt;/p&gt;
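&lt;p&gt;The loading step looks roughly like this (a minimal sketch with a hypothetical &lt;code&gt;dataset/&lt;/code&gt; root — the actual notebook builds a pandas DataFrame from these rows):&lt;/p&gt;

```python
from pathlib import Path

def load_image_rows(root: str) -> list[dict]:
    """Collect {"image": path, "label": name} rows from Real/ and Fake/ subfolders."""
    rows = []
    for label in ("Real", "Fake"):
        for path in sorted((Path(root) / label).glob("*")):
            rows.append({"image": str(path), "label": label})
    return rows

# rows = load_image_rows("dataset/")  # expects dataset/Real/ and dataset/Fake/
```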

&lt;p&gt;After loading the dataset, I quickly noticed the class distribution was skewed — one of the classes (either real or fake) had noticeably more images than the other. &lt;strong&gt;&lt;em&gt;That’s not great for training, since the model might just learn to always predict the majority class.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To fix that, I used &lt;a href="https://imbalanced-learn.org/stable/references/generated/imblearn.over_sampling.RandomOverSampler.html" rel="noopener noreferrer"&gt;&lt;code&gt;RandomOverSampler&lt;/code&gt;&lt;/a&gt; to duplicate samples from the underrepresented class. It’s a quick and dirty way to balance things — works surprisingly well for binary classification.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;imblearn.over_sampling&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RandomOverSampler&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;gc&lt;/span&gt;

&lt;span class="c1"&gt;# Separate out the labels before resampling
&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Oversample to balance the classes
&lt;/span&gt;&lt;span class="n"&gt;ros&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RandomOverSampler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;83&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_resampled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ros&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit_resample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Stick the labels back
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;y_resampled&lt;/span&gt;
&lt;span class="n"&gt;gc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collect&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Clean up some memory — just to be safe
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now with the classes balanced, I converted the DataFrame into a Hugging Face Dataset object. This makes everything later (transforms, batching, etc.) super smooth. Before that, I mapped string labels (&lt;code&gt;"Real"&lt;/code&gt; / &lt;code&gt;"Fake"&lt;/code&gt;) to numeric IDs (&lt;code&gt;0&lt;/code&gt; / &lt;code&gt;1&lt;/code&gt;). Hugging Face has a ClassLabel feature built exactly for this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ClassLabel&lt;/span&gt;

&lt;span class="c1"&gt;# Define the class order explicitly
&lt;/span&gt;&lt;span class="n"&gt;labels_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Real&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Fake&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;class_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ClassLabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_classes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;names&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;labels_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Label encoding function
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;map_label2id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;class_labels&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;str2int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;example&lt;/span&gt;

&lt;span class="c1"&gt;# Apply the mapping to the dataset
&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;map_label2id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batched&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cast_column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;class_labels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Ensures label column behaves like an integer class
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, I split the dataset into &lt;code&gt;60% for training&lt;/code&gt; and &lt;code&gt;40% for testing&lt;/code&gt;. I also made sure the label distribution was preserved across both splits using &lt;code&gt;stratify_by_column&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Train-test split with stratified labels
&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train_test_split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stratify_by_column&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;train_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;train&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;test_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;test&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  How I Trained Virtus — Step-by-Step
&lt;/h2&gt;

&lt;p&gt;Alright, now comes the fun part — training the model. We’re going to fine-tune &lt;code&gt;facebook/deit-base-distilled-patch16-224&lt;/code&gt; on our deepfake dataset using Hugging Face's &lt;code&gt;Trainer&lt;/code&gt; API.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Quick side note:&lt;/strong&gt; I trained everything inside a Kaggle Notebook using a single &lt;code&gt;NVIDIA P100 GPU&lt;/code&gt;. After some trial runs, I found it performed noticeably better than the &lt;code&gt;T4s&lt;/code&gt; (even the dual T4 setup Kaggle sometimes gives you). Turns out, the P100 has higher memory bandwidth and better raw compute — which really helps when you're fine-tuning ViTs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If Kaggle isn’t your thing, no worries. I also tried &lt;a href="//lightning.ai"&gt;Lightning AI Studio&lt;/a&gt; and &lt;a href="https://studiolab.sagemaker.aws/" rel="noopener noreferrer"&gt;AWS Studio Lab&lt;/a&gt;, and both were solid freemium options. Way more stable than Google Colab, honestly. With Colab’s free tier, I kept hitting runtime errors, couldn’t even get a GPU some days, and the lack of persistent storage was a dealbreaker. Hardware quality also felt kinda... meh.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚡ Bonus tip:&lt;/strong&gt; If you’re training locally, try managing your Python environment with &lt;code&gt;uv&lt;/code&gt;. It’s ridiculously fast. Plus, you won’t fall into &lt;code&gt;dependency hell™&lt;/code&gt;, which is honestly half the battle when setting up ML projects.  Follow &lt;a href="https://github.com/Itz-Agasta/Lopt/tree/main/models#setup-instructions" rel="noopener noreferrer"&gt;this tutorial&lt;/a&gt; if you wanna give it a spin — &lt;em&gt;&lt;strong&gt;highly recommend&lt;/strong&gt;&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Step 1: Preprocessing and Augmentation
&lt;/h3&gt;

&lt;p&gt;Before throwing data at the model, we need to make sure the input images are normalized &lt;em&gt;exactly&lt;/em&gt; the way the pre-trained ViT expects.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ViTImageProcessor&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;torchvision.transforms&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Compose&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Resize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RandomRotation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RandomAdjustSharpness&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Normalize&lt;/span&gt;

&lt;span class="n"&gt;model_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;facebook/deit-base-distilled-patch16-224&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;processor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ViTImageProcessor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;image_mean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image_std&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;image_mean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;image_std&lt;/span&gt;
&lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;height&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then I defined two sets of transforms — &lt;code&gt;one for training&lt;/code&gt; (with augmentations) and &lt;code&gt;one for validation&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;_train_transforms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Compose&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="nc"&gt;Resize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="nc"&gt;RandomRotation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nc"&gt;RandomAdjustSharpness&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nc"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="nc"&gt;Normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;image_mean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;image_std&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;_val_transforms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Compose&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="nc"&gt;Resize&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="nc"&gt;ToTensor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="nc"&gt;Normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;image_mean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;image_std&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why these augmentations?&lt;/strong&gt; &lt;em&gt;Deepfakes can vary a lot depending on the source. A little rotation and sharpness tweaks help the model generalize to those variations. No augmentation for validation though — we want to keep that clean.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Step 2: Applying Transforms
&lt;/h3&gt;

&lt;p&gt;I used Hugging Face’s &lt;code&gt;set_transform&lt;/code&gt; method to apply the preprocessing on-the-fly. This keeps RAM usage low and plays nicely with their Dataset objects. One gotcha: the transform’s output replaces the example dict entirely, so the &lt;code&gt;label&lt;/code&gt; column has to be passed through explicitly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;train_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pixel_values&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;_train_transforms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;convert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RGB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]})&lt;/span&gt;
&lt;span class="n"&gt;test_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pixel_values&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;_val_transforms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;convert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RGB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Custom Collate Function
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;Trainer&lt;/code&gt; needs batches of images and labels. Here's a simple collate function to stack tensors correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;collate_fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;examples&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;pixel_values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stack&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pixel_values&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;examples&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;examples&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pixel_values&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pixel_values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;labels&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
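&lt;p&gt;A quick sanity check helps here. This is the same &lt;code&gt;collate_fn&lt;/code&gt; repeated with dummy tensors (not real images) so it runs standalone, just to confirm the batch shapes come out the way the &lt;code&gt;Trainer&lt;/code&gt; expects:&lt;/p&gt;

```python
import torch

def collate_fn(examples):
    pixel_values = torch.stack([e["pixel_values"] for e in examples])
    labels = torch.tensor([e["label"] for e in examples])
    return {"pixel_values": pixel_values, "labels": labels}

# Two fake 3x224x224 "images" stand in for real ViT inputs
batch = collate_fn([
    {"pixel_values": torch.zeros(3, 224, 224), "label": 0},
    {"pixel_values": torch.zeros(3, 224, 224), "label": 1},
])
print(batch["pixel_values"].shape)  # torch.Size([2, 3, 224, 224])
print(batch["labels"])              # tensor([0, 1])
```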



&lt;h3&gt;
  
  
  Step 4: Loading the Model
&lt;/h3&gt;

&lt;p&gt;Now we bring in the ViT model with the correct number of labels and label mappings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ViTForImageClassification&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ViTForImageClassification&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model_str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;num_labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;label2id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Real&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Fake&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id2label&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Real&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Fake&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: TrainingArguments
&lt;/h3&gt;

&lt;p&gt;These settings worked well for me: a small learning rate, just a couple of epochs, per-epoch checkpointing, and a cap on saved checkpoints.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TrainingArguments&lt;/span&gt;

&lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TrainingArguments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;output_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;virtus&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;logging_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./logs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;evaluation_strategy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;epoch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;save_strategy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;epoch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;learning_rate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1e-6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Tiny learning rate to avoid overshooting on a sensitive task like classification
&lt;/span&gt;    &lt;span class="n"&gt;per_device_train_batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;per_device_eval_batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;num_train_epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# 2 epochs were enough to converge for my dataset; more can overfit
&lt;/span&gt;    &lt;span class="n"&gt;weight_decay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.02&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Helps regularize and reduce overfitting
&lt;/span&gt;    &lt;span class="n"&gt;warmup_steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# Linearly ramps up LR at the start for more stable training
&lt;/span&gt;    &lt;span class="n"&gt;load_best_model_at_end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;save_total_limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;report_to&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;none&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 6: Train Time!
&lt;/h3&gt;

&lt;p&gt;Let’s go!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Trainer&lt;/span&gt;

&lt;span class="n"&gt;trainer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Trainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;train_dataset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;train_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;eval_dataset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;test_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;data_collator&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;collate_fn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Required even if we don’t use text
&lt;/span&gt;    &lt;span class="n"&gt;compute_metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;accuracy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;label_ids&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It trained in around 2 hours on Kaggle’s GPU runtime and reached &lt;strong&gt;~99.2% accuracy&lt;/strong&gt;. Not bad for just two epochs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Evaluation: Did Virtus Actually Learn Anything?
&lt;/h2&gt;

&lt;p&gt;After training wrapped up, I wanted to be sure the model wasn’t just memorizing the training data. Hugging Face's &lt;code&gt;Trainer&lt;/code&gt; makes it dead simple to evaluate performance on the test set.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Run evaluation on the test set
&lt;/span&gt;&lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That gave me some solid metrics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{'eval_loss': 0.0248, 'eval_accuracy': 0.9919, ...}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yeah — over &lt;strong&gt;99% accuracy&lt;/strong&gt;. I double-checked this wasn’t a fluke by inspecting predictions manually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Make predictions on test data
&lt;/span&gt;&lt;span class="n"&gt;outputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# See predicted vs actual for the first 5 samples
&lt;/span&gt;&lt;span class="n"&gt;preds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;label_ids&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Predicted: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;id2label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;preds&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; | Actual: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;id2label&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And it matched up nicely.&lt;/p&gt;

&lt;p&gt;To dig deeper, I calculated the macro F1 score and plotted the confusion matrix to visualize how well the model was doing on both classes. And here's the result:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnqyi6sur8kvcy69rxxhr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnqyi6sur8kvcy69rxxhr.png" alt="Confusion matrix for virtus" width="682" height="590"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The matrix was basically diagonal — which means the model was nailing both classes. &lt;/p&gt;
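&lt;p&gt;The notebook code for those two metrics isn't shown above, so here's a minimal sketch of how you'd compute them with scikit-learn. The real inputs are the &lt;code&gt;preds&lt;/code&gt; and &lt;code&gt;labels&lt;/code&gt; arrays from &lt;code&gt;trainer.predict&lt;/code&gt; earlier; the dummy arrays below just keep the snippet runnable:&lt;/p&gt;

```python
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score

# preds / labels are the arrays produced by trainer.predict(test_data);
# this helper just wraps the two metrics I looked at
def summarize(preds, labels):
    macro_f1 = f1_score(labels, preds, average="macro")
    cm = confusion_matrix(labels, preds)  # rows = actual class, cols = predicted
    return macro_f1, cm

# Dummy arrays so the snippet runs standalone; swap in the real preds/labels
macro_f1, cm = summarize(np.array([0, 1, 1, 0]), np.array([0, 1, 0, 0]))
print(f"Macro F1: {macro_f1:.4f}")  # Macro F1: 0.7333
print(cm)
```

&lt;p&gt;For the plot itself, &lt;code&gt;sklearn.metrics.ConfusionMatrixDisplay&lt;/code&gt; can render the matrix as a heatmap in one call.&lt;/p&gt;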




&lt;h2&gt;
  
  
  Publishing to Hugging Face: Share It With the World
&lt;/h2&gt;

&lt;p&gt;Once I was happy with Virtus, I wanted to make it public. Hugging Face Hub is the easiest way to share models — and you can even push directly from a Kaggle notebook using &lt;a href="https://www.kaggle.com/discussions/getting-started/450378" rel="noopener noreferrer"&gt;secrets&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;First, install the CLI tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;!&lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-q&lt;/span&gt; huggingface_hub
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then authenticate using a token (I stored mine using Kaggle secrets, but you can use &lt;code&gt;huggingface-cli login&lt;/code&gt; locally):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;huggingface_hub&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;login&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;create_repo&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;kaggle_secrets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;UserSecretsClient&lt;/span&gt;

&lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;UserSecretsClient&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;get_secret&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HF_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;login&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create your repo (this can be done via the website too, but I like automation):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;create_repo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repo_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agasta/virtus&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;private&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, push both the model and its image processor (so others don’t have to guess your preprocessing steps):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoModelForImageClassification&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoFeatureExtractor&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForImageClassification&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./virtus&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;extractor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoFeatureExtractor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./virtus&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push_to_hub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agasta/virtus&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;extractor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push_to_hub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agasta/virtus&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Boom. Your model’s live.&lt;/p&gt;

&lt;p&gt;👉 Check it out here: &lt;a href="https://huggingface.co/agasta/virtus" rel="noopener noreferrer"&gt;https://huggingface.co/agasta/virtus&lt;/a&gt;&lt;/p&gt;
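&lt;p&gt;Once it's public, anyone can try it in a couple of lines via the &lt;code&gt;pipeline&lt;/code&gt; API. A quick sketch (&lt;code&gt;your_image.jpg&lt;/code&gt; is a placeholder; point it at any image file or URL):&lt;/p&gt;

```python
from transformers import pipeline

def classify(path, repo="agasta/virtus"):
    # Downloads the published checkpoint (and its preprocessing config) from the Hub
    clf = pipeline("image-classification", model=repo)
    return clf(path)  # a list of {'label': ..., 'score': ...} dicts

# classify("your_image.jpg")  # "your_image.jpg" is a placeholder path
```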

&lt;p&gt;But we’re not done yet.&lt;/p&gt;

&lt;p&gt;In the next blog, I’ll show you how to wrap this model in a FastAPI-powered backend, make it production-ready, and deploy it like a real-world service — something you can actually integrate into an app or use in a real-time system.&lt;/p&gt;

&lt;p&gt;If you're into this kind of stuff — AI, web3, backend dev, DevOps, hackathon builds — follow me on X &lt;a href="https://x.com/idkAgasta" rel="noopener noreferrer"&gt;@idkAgasta&lt;/a&gt;. I post cool projects, quick tips, and sometimes just chaos.&lt;/p&gt;

&lt;p&gt;Until next time: keep building, keep shipping, nerds.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>python</category>
      <category>hackathon</category>
    </item>
  </channel>
</rss>
