<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Shruti Gupta</title>
    <description>The latest articles on DEV Community by Shruti Gupta (@shrutigupta).</description>
    <link>https://dev.to/shrutigupta</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3983674%2Ff0ce2681-023e-4ad9-a529-b5f1b436f51b.png</url>
      <title>DEV Community: Shruti Gupta</title>
      <link>https://dev.to/shrutigupta</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/shrutigupta"/>
    <language>en</language>
    <item>
      <title>The Click Behind Every Click - My Biggest Takeaway from a Full Stack Session</title>
      <dc:creator>Shruti Gupta</dc:creator>
      <pubDate>Sat, 04 Jul 2026 06:48:19 +0000</pubDate>
      <link>https://dev.to/shrutigupta/the-click-behind-every-click-my-biggest-takeaway-from-a-full-stack-session-54io</link>
      <guid>https://dev.to/shrutigupta/the-click-behind-every-click-my-biggest-takeaway-from-a-full-stack-session-54io</guid>
      <description>&lt;p&gt;&lt;strong&gt;Have you ever used an app and wondered what actually happens after you click a button?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Honestly, I hadn't.&lt;/p&gt;

&lt;p&gt;I knew terms like &lt;em&gt;frontend&lt;/em&gt;, &lt;em&gt;backend&lt;/em&gt;, &lt;em&gt;database&lt;/em&gt;, and &lt;em&gt;API&lt;/em&gt;, but they always felt like separate pieces of a puzzle.&lt;/p&gt;

&lt;p&gt;After attending the &lt;strong&gt;Full Stack Development&lt;/strong&gt; session as part of the &lt;strong&gt;AWS Summer Builder Cohort 2026&lt;/strong&gt;, conducted by &lt;strong&gt;Sumit Grover&lt;/strong&gt; and &lt;strong&gt;Vridhi Duggal&lt;/strong&gt;, I finally started seeing the complete picture.&lt;/p&gt;

&lt;p&gt;One thing I really liked was that the session wasn't about introducing fancy technologies. Instead, it answered a much more interesting question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"What actually happens behind the scenes?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A single user request isn't just a click on a webpage.&lt;/p&gt;

&lt;p&gt;It travels through the frontend, reaches the backend, interacts with databases, sometimes checks the cache first, processes the required logic, and finally returns the response we see on our screen.&lt;/p&gt;

&lt;p&gt;I've used hundreds of applications, but this was probably the first time I paused and thought about everything happening in those few milliseconds.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fsf1v1z09843m84g32to0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fsf1v1z09843m84g32to0.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Another concept that genuinely clicked for me was &lt;strong&gt;caching&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Earlier, I simply knew that caching makes applications faster.&lt;/p&gt;

&lt;p&gt;The explanation during the session showed me why.&lt;/p&gt;

&lt;p&gt;If thousands of users are requesting the same data repeatedly, why keep asking the database every single time?&lt;/p&gt;

&lt;p&gt;Store frequently accessed data in memory, reduce unnecessary database calls, lower latency, and let the database handle requests that actually require it.&lt;/p&gt;

&lt;p&gt;Such a simple idea.&lt;/p&gt;

&lt;p&gt;Such a huge impact.&lt;/p&gt;
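&lt;p&gt;To make the idea concrete, here is a minimal cache-aside sketch in Python. It was not shown in the session: &lt;code&gt;CacheAside&lt;/code&gt; and &lt;code&gt;slow_db_lookup&lt;/code&gt; are names invented for illustration, the "database" is just a function, and a real application would more likely put a dedicated cache such as Redis in front of the database.&lt;/p&gt;

```python
import time


class CacheAside:
    """Tiny in-memory cache-aside wrapper (illustrative only)."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # function that performs the real database read
        self._ttl = ttl_seconds        # how long a cached value stays fresh
        self._clock = clock            # injectable clock, handy for testing
        self._store = {}               # key: (expires_at, value)
        self.db_calls = 0              # counts how often the database was hit

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        # Cache hit: the value exists and has not expired yet.
        if entry is not None and entry[0] > now:
            return entry[1]
        # Cache miss: ask the database once, then remember the answer.
        self.db_calls += 1
        value = self._load(key)
        self._store[key] = (now + self._ttl, value)
        return value


def slow_db_lookup(key):
    # Hypothetical stand-in for a real database query.
    return {"id": key, "name": "item-" + str(key)}


cache = CacheAside(slow_db_lookup)
for _ in range(1000):
    cache.get(42)
print(cache.db_calls)  # 1
```

&lt;p&gt;A thousand reads of the same key reach the database only once. The rest are served from memory until the entry expires.&lt;/p&gt;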

&lt;p&gt;Another interesting discussion was around scalability.&lt;/p&gt;

&lt;p&gt;Building an application that works for 50 users is one thing.&lt;/p&gt;

&lt;p&gt;Building one that continues to work smoothly for thousands or even millions of users is a completely different challenge. It made me realize that writing code is only one part of software development - designing systems that can grow is equally important.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fmltpgczsqq8j7yiydov0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fmltpgczsqq8j7yiydov0.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Apart from the technical content, I really appreciated the way &lt;strong&gt;Sumit Grover&lt;/strong&gt; and &lt;strong&gt;Vridhi Duggal&lt;/strong&gt; conducted the session.&lt;/p&gt;

&lt;p&gt;The coordination between them was seamless. It never felt like two speakers taking turns. Every topic naturally connected to the next, making the entire session feel more like an engaging conversation than a presentation.&lt;/p&gt;

&lt;p&gt;I joined the session expecting to learn about full stack development.&lt;/p&gt;

&lt;p&gt;I left with something much more valuable - a better understanding of how all the pieces come together to build the applications we use every day.&lt;/p&gt;

&lt;p&gt;A big thank you to &lt;strong&gt;Sumit Grover&lt;/strong&gt;, &lt;strong&gt;Vridhi Duggal&lt;/strong&gt;, and the entire &lt;strong&gt;AWS Summer Builder Cohort 2026&lt;/strong&gt; team for such an insightful session.&lt;/p&gt;

&lt;p&gt;Already looking forward to the next one! 🚀&lt;/p&gt;

&lt;p&gt;#AWSSummerBuilderCohort2026 #AWS #FullStackDevelopment #LearningInPublic #Backend #WebDevelopment #CloudComputing #Students #Tech&lt;/p&gt;

</description>
      <category>fullstack</category>
      <category>aws</category>
      <category>learning</category>
      <category>session</category>
    </item>
    <item>
      <title>I Thought I Was Learning AWS Services. I Was Actually Learning to Solve Problems.</title>
      <dc:creator>Shruti Gupta</dc:creator>
      <pubDate>Fri, 03 Jul 2026 20:55:55 +0000</pubDate>
      <link>https://dev.to/shrutigupta/i-thought-i-was-learning-aws-services-i-was-actually-learning-to-solve-problems-en0</link>
      <guid>https://dev.to/shrutigupta/i-thought-i-was-learning-aws-services-i-was-actually-learning-to-solve-problems-en0</guid>
      <description>&lt;p&gt;When I joined this AWS challenge, I expected one thing - a lot of new service names.&lt;/p&gt;

&lt;p&gt;Amazon S3. Amazon RDS. Amazon DynamoDB. Amazon SNS. Amazon Bedrock.&lt;/p&gt;

&lt;p&gt;Like many beginners, I thought the goal would be simple: learn what each service does, remember its definition, complete the challenge, and move on.&lt;/p&gt;

&lt;p&gt;What I didn't realize was that each week's challenge was quietly building on the previous one.&lt;/p&gt;

&lt;p&gt;Looking back, I don't think these three weeks were just about learning AWS. They were about changing the way I approach technical problems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Week 1 - Stop Memorizing. Start Asking "Why?"
&lt;/h2&gt;

&lt;p&gt;The first week's task was to understand five AWS services, explain them in our own words, think of real-life use cases, and share one feature we found interesting.&lt;/p&gt;

&lt;p&gt;At first, I approached it the way I usually prepare for a new topic - read, understand, and remember.&lt;/p&gt;

&lt;p&gt;But I noticed something.&lt;/p&gt;

&lt;p&gt;The more I tried to remember definitions, the more everything started sounding similar. Then I changed my approach and asked myself one simple question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Why would someone even build this service?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That single question made everything much easier to understand.&lt;/p&gt;

&lt;p&gt;Instead of seeing Amazon S3 as just another storage service, I imagined the millions of photos and videos uploaded every second on platforms like Instagram or YouTube.&lt;/p&gt;

&lt;p&gt;Instead of treating Amazon SNS as a notification service, I pictured a hackathon platform sending registration confirmations, mentor updates, deadline reminders, and final results to thousands of students.&lt;/p&gt;

&lt;p&gt;Instead of memorizing what Amazon Bedrock does, I imagined building an AI assistant that could help students discover hackathons based on their interests.&lt;/p&gt;

&lt;p&gt;Those examples were simply my way of connecting technical concepts with situations I could actually relate to.&lt;/p&gt;

&lt;p&gt;That was my first mindset shift.&lt;/p&gt;

&lt;p&gt;Technology becomes much easier to understand when you stop memorizing features and start thinking about the problems they were created to solve.&lt;/p&gt;

&lt;h2&gt;
  
  
  Week 2 - Knowing the Tool Isn't Enough
&lt;/h2&gt;

&lt;p&gt;The second week's challenge felt completely different.&lt;/p&gt;

&lt;p&gt;This time, we weren't asked to explain services.&lt;/p&gt;

&lt;p&gt;Instead, we were given different scenarios and had to choose the most suitable AWS service, along with the reasoning behind our choice.&lt;/p&gt;

&lt;p&gt;Initially, I expected every question to have one obvious answer.&lt;/p&gt;

&lt;p&gt;It didn't.&lt;/p&gt;

&lt;p&gt;I found myself comparing services, thinking about trade-offs, and asking questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why is this service a better fit than another?&lt;/li&gt;
&lt;li&gt;What exactly is the requirement here?&lt;/li&gt;
&lt;li&gt;Am I solving the right problem?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For the first time, I realized that learning technology isn't just about knowing what different tools can do.&lt;/p&gt;

&lt;p&gt;It's about understanding &lt;em&gt;why&lt;/em&gt; one solution makes more sense than another.&lt;/p&gt;

&lt;h2&gt;
  
  
  Week 3 - Understand the Problem Before the Solution
&lt;/h2&gt;

&lt;p&gt;The third week's challenge was probably my favorite.&lt;/p&gt;

&lt;p&gt;Instead of directly choosing an AWS service, we were given bug scenarios.&lt;/p&gt;

&lt;p&gt;Our task was to identify what had actually gone wrong, decide which AWS service could help, and explain what we would tell the development team to fix the issue.&lt;/p&gt;

&lt;p&gt;This completely changed the order in which I started thinking.&lt;/p&gt;

&lt;p&gt;Earlier, my first thought used to be:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Which AWS service fits here?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;After this challenge, my first thought became:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"What is the actual problem?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It sounds like a small difference.&lt;/p&gt;

&lt;p&gt;But I think it's one of the most important lessons I've learned.&lt;/p&gt;

&lt;p&gt;Because if you misunderstand the problem, even the best technology won't give you the right solution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Looking Back
&lt;/h2&gt;

&lt;p&gt;Three weeks ago, I would've looked at a problem and wondered, "Which AWS service should I use?"&lt;/p&gt;

&lt;p&gt;Today, I'd probably ask, "What exactly is the problem I'm trying to solve?"&lt;/p&gt;

&lt;p&gt;Instead of memorizing services, I try to understand why they exist.&lt;/p&gt;

&lt;p&gt;Instead of searching for the "correct" solution immediately, I spend more time understanding the requirements.&lt;/p&gt;

&lt;p&gt;Instead of jumping to conclusions, I try to identify the root cause first.&lt;/p&gt;

&lt;p&gt;Looking back, I don't think these three weeks were really about AWS.&lt;/p&gt;

&lt;p&gt;They were about learning a different way to think.&lt;/p&gt;

&lt;p&gt;I know I've only scratched the surface of cloud computing, and there's still a lot left to explore.&lt;/p&gt;

&lt;p&gt;But these challenges gave me something I'll carry beyond AWS.&lt;/p&gt;

&lt;p&gt;Whether I'm working on an open-source issue, building a college project, participating in a hackathon, or learning a completely new technology, I think I'll always start with the same question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"What problem am I actually trying to solve?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Earlier, I used to ask,&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"What does this technology do?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now I find myself asking,&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Why was this technology built in the first place?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Surprisingly, that one question has made learning feel much less overwhelming and a lot more meaningful.&lt;/p&gt;

&lt;p&gt;And for me, that's been the biggest takeaway from these three weeks.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>challenge</category>
      <category>productivity</category>
      <category>learning</category>
    </item>
    <item>
      <title>Beyond AWS Service Names: Understanding the Problems They Actually Solve</title>
      <dc:creator>Shruti Gupta</dc:creator>
      <pubDate>Fri, 26 Jun 2026 17:15:50 +0000</pubDate>
      <link>https://dev.to/shrutigupta/beyond-aws-service-names-understanding-the-problems-they-actually-solve-100</link>
      <guid>https://dev.to/shrutigupta/beyond-aws-service-names-understanding-the-problems-they-actually-solve-100</guid>
      <description>&lt;p&gt;&lt;strong&gt;My Learning Journey with AWS&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When I first saw AWS, I honestly felt overwhelmed.&lt;/p&gt;

&lt;p&gt;There were so many services (S3, RDS, DynamoDB, SNS, Bedrock), and they all sounded important, but I couldn't understand why AWS needed so many of them. At one point, I even thought, "Can't one database or one storage service do everything?"&lt;/p&gt;

&lt;p&gt;While exploring these services for an AWS challenge, I stopped trying to memorize definitions and instead asked myself one simple question:&lt;/p&gt;

&lt;p&gt;"What real-world problem is this service trying to solve?"&lt;/p&gt;

&lt;p&gt;That completely changed how I understood AWS.&lt;/p&gt;

&lt;p&gt;Here are the five services that helped me look at cloud computing differently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Bedrock&lt;/strong&gt;: AI Without Building an AI Model Yourself&lt;br&gt;
When I first heard about Amazon Bedrock, I thought it was used to create and train AI models from scratch.&lt;/p&gt;

&lt;p&gt;After learning more, I realized I had misunderstood it.&lt;/p&gt;

&lt;p&gt;Bedrock is more about using existing foundation models and customizing them with your own data to build AI applications. AWS takes care of the complex infrastructure, while developers focus on solving actual problems.&lt;/p&gt;

&lt;p&gt;The first idea that came to my mind was a Hackathon Discovery Platform.&lt;/p&gt;

&lt;p&gt;Students usually spend hours searching different websites to find suitable hackathons. Instead, imagine asking an AI assistant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;"Find beginner-friendly AI hackathons."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;"Which hackathons allow solo participation?"&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;"Show hackathons with prize money above ₹50,000."&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The assistant could understand information like eligibility, deadlines, FAQs, technologies used, and previous editions because it is connected to a knowledge base.&lt;/p&gt;

&lt;p&gt;That was the moment I understood what Bedrock is actually meant for: not creating AI models, but creating useful AI applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon S3&lt;/strong&gt;: Not Just Storage, but Storage That Never Becomes a Problem&lt;br&gt;
Before learning about Amazon S3, cloud storage sounded similar to Google Drive.&lt;/p&gt;

&lt;p&gt;Then I realized companies deal with millions and sometimes billions of files.&lt;/p&gt;

&lt;p&gt;Photos.&lt;/p&gt;

&lt;p&gt;Videos.&lt;/p&gt;

&lt;p&gt;Documents.&lt;/p&gt;

&lt;p&gt;Backups.&lt;/p&gt;

&lt;p&gt;Datasets.&lt;/p&gt;

&lt;p&gt;Managing all of that isn't as simple as saving files on a computer.&lt;/p&gt;

&lt;p&gt;The first example I thought of was social media platforms like Instagram, YouTube, and Snapchat. Every second, users upload content, and all those media files need to be stored somewhere reliable.&lt;/p&gt;

&lt;p&gt;Another idea I had was a college event archive where photographs, certificates, recordings, and event documents could be safely stored and accessed even years later.&lt;/p&gt;

&lt;p&gt;S3 made me realize that storage isn't only about saving files. It's about making sure those files remain secure, durable, and available whenever they're needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon RDS&lt;/strong&gt;: Let AWS Handle the Database Work&lt;br&gt;
I had worked with databases before, but I never really thought about what happens behind the scenes.&lt;/p&gt;

&lt;p&gt;I assumed databases just "store data."&lt;/p&gt;

&lt;p&gt;Later I learned someone has to maintain them too.&lt;/p&gt;

&lt;p&gt;Someone has to manage backups.&lt;/p&gt;

&lt;p&gt;Someone has to install updates.&lt;/p&gt;

&lt;p&gt;Someone has to recover data if something goes wrong.&lt;/p&gt;

&lt;p&gt;That's where Amazon RDS made sense to me.&lt;/p&gt;

&lt;p&gt;The example that immediately clicked was a University Examination Management System.&lt;/p&gt;

&lt;p&gt;Student records, attendance, marks, course details, and examination results all have relationships with each other, making a relational database the right choice.&lt;/p&gt;

&lt;p&gt;I also liked the fact that RDS can automatically create backups and even maintain a standby database in another Availability Zone. It made me realize why organizations trust managed databases instead of maintaining everything themselves.&lt;/p&gt;

&lt;p&gt;Since educational data is sensitive, features like encryption and access controls also help reduce the risk of unauthorized access.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DynamoDB&lt;/strong&gt;: When Millions of People Use Your App Together&lt;br&gt;
Understanding DynamoDB also helped me understand that not every database is built for the same purpose.&lt;/p&gt;

&lt;p&gt;Initially, I wondered why AWS had both RDS and DynamoDB.&lt;/p&gt;

&lt;p&gt;Then I thought about LinkedIn.&lt;/p&gt;

&lt;p&gt;People continuously like posts.&lt;/p&gt;

&lt;p&gt;Comment.&lt;/p&gt;

&lt;p&gt;React.&lt;/p&gt;

&lt;p&gt;Send connection requests.&lt;/p&gt;

&lt;p&gt;View profiles.&lt;/p&gt;

&lt;p&gt;All these activities happen at an enormous scale.&lt;/p&gt;

&lt;p&gt;A traditional relational database isn't always the best fit for this kind of workload.&lt;/p&gt;

&lt;p&gt;That's where DynamoDB comes in.&lt;/p&gt;

&lt;p&gt;It automatically scales while still responding incredibly fast, even if millions of users are active at the same time.&lt;/p&gt;

&lt;p&gt;One feature I found particularly interesting was Global Tables, where data stays synchronized across multiple AWS Regions, making applications more reliable for users around the world.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon SNS&lt;/strong&gt;: One Update, Many Notifications&lt;br&gt;
SNS became one of the easiest services for me to visualize because we all receive notifications every day.&lt;/p&gt;

&lt;p&gt;What I liked most was the Publish-Subscribe idea.&lt;/p&gt;

&lt;p&gt;Instead of sending the same update separately to different places, an application publishes one message and SNS distributes it wherever it's needed.&lt;/p&gt;

&lt;p&gt;The example I came up with was a Hackathon Management Platform.&lt;/p&gt;

&lt;p&gt;Whenever students register, form teams, or need mentor session details, meeting schedules, deadline reminders, or final results, SNS could send notifications through email, SMS, mobile push notifications, and the platform itself at the same time.&lt;/p&gt;

&lt;p&gt;The same idea could also be used inside organizations to send important meeting announcements across different communication channels.&lt;/p&gt;
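&lt;p&gt;The publish-subscribe idea itself fits in a few lines of Python. The sketch below is an in-memory stand-in, not the SNS API: the &lt;code&gt;Topic&lt;/code&gt; class, the topic name, and the three handlers are hypothetical, and in real SNS the subscribers would be email addresses, phone numbers, queues, or HTTP endpoints attached to a topic.&lt;/p&gt;

```python
class Topic:
    """In-memory stand-in for a pub/sub topic (not the real SNS API)."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, handler):
        # A handler is any function that accepts the published message.
        self._subscribers.append(handler)

    def publish(self, message):
        # One publish call fans out to every subscriber, in subscription order.
        return [handler(message) for handler in self._subscribers]


results_topic = Topic("hackathon-results")
results_topic.subscribe(lambda msg: "email: " + msg)
results_topic.subscribe(lambda msg: "sms: " + msg)
results_topic.subscribe(lambda msg: "push: " + msg)

deliveries = results_topic.publish("Final results are out!")
for line in deliveries:
    print(line)
```

&lt;p&gt;The publisher only knows about the topic. Adding a fourth channel means adding one more subscription, with no change to the code that publishes.&lt;/p&gt;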

&lt;p&gt;Another interesting thing I learned was that SNS supports both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Application-to-Person (A2P)&lt;/strong&gt; communication, like emails and SMS.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Application-to-Application (A2A)&lt;/strong&gt; communication, where different software systems notify each other automatically.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That made me realize SNS is much more than a notification service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What Changed for Me?&lt;/strong&gt;&lt;br&gt;
Before this challenge, I used to think AWS was just a collection of complicated cloud services.&lt;/p&gt;

&lt;p&gt;Now I see each service as a solution to a specific problem.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;If I need to store huge amounts of files, I think about Amazon S3.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If I need structured data with relationships, Amazon RDS makes sense.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If I need a database for millions of fast user interactions, DynamoDB is a better choice.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If I need to notify people or systems whenever something happens, I think of Amazon SNS.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If I want to build an AI-powered application without worrying about training large models, Amazon Bedrock is the service I would explore.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The biggest lesson I learned wasn't remembering AWS service names.&lt;/p&gt;

&lt;p&gt;It was understanding why each service exists.&lt;/p&gt;

&lt;p&gt;Once I started thinking in terms of "What problem am I trying to solve?", AWS became much less intimidating and much more practical.&lt;/p&gt;

&lt;p&gt;This is only the beginning of my cloud journey, but now when I hear the name of an AWS service, I don't just remember its definition; I remember the real-world problem it can solve.&lt;/p&gt;

&lt;p&gt;I know I've only scratched the surface of AWS, but this challenge changed the way I learn technology. Instead of memorizing concepts, I now try to connect every new service with a real-world problem. That approach made learning much more meaningful for me.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>beginners</category>
      <category>cloudcomputing</category>
      <category>learning</category>
    </item>
    <item>
      <title>Your On-Call Agent Forgot Everything. Ours Doesn't.</title>
      <dc:creator>Shruti Gupta</dc:creator>
      <pubDate>Sun, 14 Jun 2026 10:38:35 +0000</pubDate>
      <link>https://dev.to/shrutigupta/your-on-call-agent-forgot-everything-ours-doesnt-ppj</link>
      <guid>https://dev.to/shrutigupta/your-on-call-agent-forgot-everything-ours-doesnt-ppj</guid>
      <description>&lt;p&gt;The first time I used something that actually remembered a past production failure, I didn't fully trust it. I submitted the same incident twice just to make sure the result wasn't a coincidence.&lt;/p&gt;

&lt;p&gt;It wasn't.&lt;/p&gt;

&lt;p&gt;I was building On-Call Copilot — an incident response agent that doesn't just generate advice, it recalls what actually happened the last time something similar broke. The live app is at &lt;a href="https://on-call-copilot.vercel.app" rel="noopener noreferrer"&gt;on-call-copilot.vercel.app&lt;/a&gt;. The memory layer is &lt;a href="https://hindsight.vectorize.io/" rel="noopener noreferrer"&gt;Hindsight&lt;/a&gt;. And the thing that surprised me most wasn't how hard it was to integrate — it was how immediately obvious the difference was once it was working.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsyuzh2oytg283q45nmr2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsyuzh2oytg283q45nmr2.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What the system actually does
&lt;/h2&gt;

&lt;p&gt;On-Call Copilot is an AI Incident Commander with organizational memory. The tagline on the app is "Learn from every outage. Resolve the next one faster." That's not marketing — it's literally the architecture.&lt;/p&gt;

&lt;p&gt;When a production alert comes in — a Sentry traceback, a Datadog trigger, raw CLI logs — you paste it into the Incident Ingestion Console. The system runs it through a five-stage pipeline:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production Alert → FastAPI Router → Hindsight Memory → Groq Reasoning → SRE Playbook&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Stage three is the one that matters. Before Groq generates anything, Hindsight's semantic graph runs a recall against the full organizational incident history. It doesn't do keyword search. It finds semantically related past incidents — things that failed for the same underlying reason, even if the error messages look different on the surface.&lt;/p&gt;

&lt;p&gt;What comes back isn't just "here's a similar incident." It's structured: historical root cause, successful fix, and critically — failed attempts to avoid. Things someone already tried that made it worse. That last part is what makes this different from any generic LLM response.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzwos8ljfe3hmdow8n7ot.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzwos8ljfe3hmdow8n7ot.png" alt=" " width="676" height="835"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The FastAPI layer: how the triage request flows
&lt;/h2&gt;

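&lt;p&gt;Every triage request starts at a single POST endpoint, &lt;code&gt;/analyze&lt;/code&gt; (a second endpoint, &lt;code&gt;/teach&lt;/code&gt;, stores incidents for later recall). The frontend sends the raw alert text, and FastAPI handles the orchestration: first pulling memory context from Hindsight, then passing that context alongside the alert into the Groq reasoning chain.&lt;br&gt;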
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# backend/api.py
&lt;/span&gt;&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/analyze&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;IncidentRequest&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;analysis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;analyze_incident&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;incident&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/teach&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;teach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;IncidentRequest&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;store_incident&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;incident&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;saved&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;@app.get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;home&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;On-Call Copilot API Running 🚀&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ordering is the key design decision. The recall happens &lt;em&gt;before&lt;/em&gt; the LLM sees anything. By the time Groq is reasoning about root cause, it already has the organizational context baked in — not as a separate lookup, but as part of the prompt.&lt;/p&gt;
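&lt;p&gt;The endpoint above delegates to &lt;code&gt;analyze_incident&lt;/code&gt;, whose body is not shown here. As a rough sketch of the recall-then-reason order (not the project's actual code), the function below takes the memory lookup and the LLM call as plain callables. &lt;code&gt;triage&lt;/code&gt;, &lt;code&gt;fake_recall&lt;/code&gt;, and &lt;code&gt;fake_reason&lt;/code&gt; are hypothetical names, and neither stub talks to Hindsight or Groq.&lt;/p&gt;

```python
def triage(alert_text, recall, reason):
    """Recall first, then reason: memory context becomes part of the prompt."""
    # Step 1: look up related past incidents before the LLM sees anything.
    memories = recall(alert_text)
    context = "\n".join("- " + m for m in memories) or "- (no similar incidents found)"

    # Step 2: build one prompt that carries both the history and the new alert.
    prompt = (
        "Past incidents:\n" + context + "\n\n"
        "New alert:\n" + alert_text
    )

    # Step 3: only now hand the prompt to the reasoning model.
    return reason(prompt)


def fake_recall(alert_text):
    # Hypothetical stub standing in for the memory layer.
    return ["INC-103: connection pool exhaustion; fix: raise pool limit to 50"]


def fake_reason(prompt):
    # Hypothetical stub standing in for the LLM; echoes the prompt for inspection.
    return prompt


print(triage("FATAL: database pool choked during active transaction", fake_recall, fake_reason))
```

&lt;p&gt;Passing the two steps in as callables keeps the ordering explicit and makes it easy to test without network calls.&lt;/p&gt;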

&lt;h2&gt;
  
  
  What Hindsight's recall actually returns
&lt;/h2&gt;

&lt;p&gt;I had never used &lt;a href="https://hindsight.vectorize.io/" rel="noopener noreferrer"&gt;Hindsight&lt;/a&gt; before this project. My mental model going in was that it would behave like search — give it keywords, get back matching documents.&lt;/p&gt;

&lt;p&gt;What it actually does is closer to semantic reasoning over a knowledge graph. When I submitted "FATAL: database pool choked during active transaction," it recalled two past incidents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;INC-103&lt;/strong&gt; — Database connection pool exhaustion under high transactional traffic. 91% match. Successful fix: increment proxy pool limits to 50, implement transaction timeout safeguards. Failed attempt: scaling pool replicas dynamically (triggered DB lock storms).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;INC-104&lt;/strong&gt; — Redis cluster memory allocation overrun. 87% match. Successful fix: configure maxmemory-policy to volatile-lru. Failed attempt: cold restarts of Redis service (nuked all active sessions).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The match percentages are real confidence scores from &lt;a href="https://vectorize.io/what-is-agent-memory" rel="noopener noreferrer"&gt;Hindsight's agent memory&lt;/a&gt; system. The "failed attempt" field is the part that earns its keep at 3 AM — it tells you what not to reach for before you waste 40 minutes on it.&lt;/p&gt;
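&lt;p&gt;One way to picture a recalled incident is as a small record with the fields listed above. The dataclass below is an assumption made for illustration, not Hindsight's response format: &lt;code&gt;RecalledIncident&lt;/code&gt; and &lt;code&gt;to_prompt_lines&lt;/code&gt; are hypothetical names, and the values are paraphrased from INC-103.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecalledIncident:
    """Hypothetical shape for one recalled incident (not Hindsight's schema)."""

    incident_id: str
    summary: str
    match: float            # similarity score between 0 and 1
    successful_fix: str
    failed_attempt: str

    def to_prompt_lines(self):
        # Put the thing to avoid right next to the fix, so it is hard to miss.
        return "\n".join([
            f"{self.incident_id} ({self.match:.0%} match): {self.summary}",
            f"  DO: {self.successful_fix}",
            f"  AVOID: {self.failed_attempt}",
        ])


inc_103 = RecalledIncident(
    incident_id="INC-103",
    summary="Database connection pool exhaustion under high transactional traffic",
    match=0.91,
    successful_fix="raise proxy pool limits to 50 and add transaction timeout safeguards",
    failed_attempt="scaling pool replicas dynamically (triggered DB lock storms)",
)
print(inc_103.to_prompt_lines())
```

&lt;p&gt;Keeping the failed attempt as its own field, rather than burying it in free text, is what lets the playbook call it out explicitly.&lt;/p&gt;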




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3q34kmvr0vlpvjm9hkep.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3q34kmvr0vlpvjm9hkep.png" alt=" " width="799" height="305"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The two memory operations: retain and recall
&lt;/h2&gt;

&lt;p&gt;The entire Hindsight integration in &lt;code&gt;backend/memory.py&lt;/code&gt; is built on two calls. Here are both of them side by side:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# backend/memory.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;contextlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;contextmanager&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;hindsight&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HindsightClient&lt;/span&gt;

&lt;span class="n"&gt;BANK_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BANK_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@contextmanager&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_get_hindsight_client&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HindsightClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HINDSIGHT_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;
    &lt;span class="k"&gt;finally&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;recall_similar_incidents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;incident_description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;_get_hindsight_client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;bank_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BANK_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;incident_description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_resolution&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;incident_description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resolution_summary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INCIDENT: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;incident_description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RESOLUTION: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resolution_summary&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;_get_hindsight_client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;retain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;bank_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BANK_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;incident_postmortem&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two functions. One call each. The entire organizational memory layer is those ~30 lines. Everything the &lt;a href="https://github.com/vectorize-io/hindsight" rel="noopener noreferrer"&gt;Hindsight retain/recall API&lt;/a&gt; does behind the scenes (semantic indexing, graph traversal, confidence scoring) comes for free.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pipeline in practice
&lt;/h2&gt;

&lt;p&gt;The Incident Resolution Timeline in the UI makes the pipeline visible in real time:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Alert Received&lt;/strong&gt; — raw metrics or trace ingested into buffer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory Retrieved&lt;/strong&gt; — FastAPI semantic correlation against regional index maps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Root Cause Identified&lt;/strong&gt; — LLM isolates anomalies, computes match confidence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resolution Suggested&lt;/strong&gt; — detailed playbook with avoidance warnings generated&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge Stored&lt;/strong&gt; — post-mortem answers indexed back into organizational memory&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That fifth step is the learning loop. Every resolved incident feeds back into Hindsight via &lt;code&gt;retain()&lt;/code&gt;. The next similar incident pulls it as recalled context. The system gets more specific over time — not because the model changed, but because the memory bank grew.&lt;/p&gt;
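
&lt;p&gt;The loop can be illustrated with a toy stand-in. The &lt;code&gt;InMemoryBank&lt;/code&gt; class below is hypothetical and is not Hindsight: it scores memories by crude word overlap instead of semantic recall, purely to show that the same query returns more once a resolution has been retained.&lt;/p&gt;

```python
class InMemoryBank:
    """Toy stand-in for a memory bank; word overlap replaces semantic recall."""

    def __init__(self):
        self.memories = []

    def retain(self, content):
        # Store the postmortem text as-is.
        self.memories.append(content)

    def recall(self, query, top_k=5):
        # Rank stored memories by how many lowercase words they share with the query.
        query_words = set(query.lower().split())
        scored = []
        for memory in self.memories:
            overlap = len(query_words.intersection(memory.lower().split()))
            if overlap:
                scored.append((overlap, memory))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [memory for _, memory in scored[:top_k]]


bank = InMemoryBank()
first_pass = bank.recall("database pool exhausted")  # nothing stored yet
bank.retain("INCIDENT: database pool exhausted\nRESOLUTION: raised pool limit")
second_pass = bank.recall("database pool choked")  # now finds the postmortem
```

&lt;p&gt;Nothing about the recall logic changed between the two calls; only the stored history did, which is the same effect the paragraph above describes.&lt;/p&gt;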

&lt;h2&gt;
  
  
  Before vs after — what memory actually changes
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Without organizational memory:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generic advice pulled from training data — "check network adapters," "reinstall OS"&lt;/li&gt;
&lt;li&gt;No awareness of what's already been tried in your specific environment&lt;/li&gt;
&lt;li&gt;Suggestions that have failed twice before in your cluster show up again&lt;/li&gt;
&lt;li&gt;Every incident starts from zero&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;With Hindsight memory (150 incidents in the knowledge base):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Precise matches pulled from actual past outages, not textbook examples&lt;/li&gt;
&lt;li&gt;Failed fixes flagged explicitly so engineers don't repeat them&lt;/li&gt;
&lt;li&gt;One-step indexing after resolution so the next incident benefits immediately&lt;/li&gt;
&lt;li&gt;42% estimated reduction in mean time to resolution shown live on the dashboard&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That 42% figure isn't a benchmark I'm claiming — it's what the dashboard shows based on the system's historical recall performance across the loaded incident knowledge base.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the Teach the System panel does
&lt;/h2&gt;

&lt;p&gt;At the bottom of the app is a section called "Teach the System — Training Mode." Paste a resolution summary or an incident link, submit it, and Hindsight indexes it immediately. The Telemetry Console logs the whole thing in real time — you can watch the retain call go out, see the success response, and know that the next engineer who hits a similar issue will get this resolution in their recalled context.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnhcuv1w2fs175375o7m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnhcuv1w2fs175375o7m.png" alt=" " width="800" height="212"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;The log from a real session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;[1:20:26 pm] SUCCESS [OUTPOST] Triage finished successfully in 44.97s. Status 200 OK.
[1:20:27 pm] SUCCESS [OUTPOST] Taught system successfully in 39.57s. Status 200 OK.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Triage and teach. Those two operations are the entire product loop.&lt;/p&gt;

&lt;h2&gt;
  
  
  What using Hindsight taught me
&lt;/h2&gt;

&lt;p&gt;I went into this thinking about memory as a storage problem. I came out thinking about it as a retrieval design problem. What you store matters less than whether the right things surface at the right time.&lt;/p&gt;

&lt;p&gt;Hindsight's &lt;a href="https://github.com/vectorize-io/hindsight" rel="noopener noreferrer"&gt;retain/recall API&lt;/a&gt; is small — two core operations cover almost everything. But the quality of what you get back at recall time depends entirely on how well-structured the retained content is. A postmortem that clearly separates root cause, successful fix, and failed attempts produces recall that's immediately actionable. A vague free-text summary produces noise.&lt;/p&gt;
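
&lt;p&gt;One way to enforce that separation is to render every postmortem through a fixed template before it is retained. This is a sketch under assumptions: the &lt;code&gt;Postmortem&lt;/code&gt; class and its section labels are hypothetical, and the root cause text in the example is a placeholder, since the recalled INC-103 summary above does not state one.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Postmortem:
    incident: str
    root_cause: str
    successful_fix: str
    failed_attempts: list = field(default_factory=list)

    def to_memory_text(self):
        """Render labeled sections so each part stays distinct once retained."""
        failed = "; ".join(self.failed_attempts) or "none recorded"
        return "\n".join(
            [
                "INCIDENT: " + self.incident,
                "ROOT CAUSE: " + self.root_cause,
                "SUCCESSFUL FIX: " + self.successful_fix,
                "FAILED ATTEMPTS: " + failed,
            ]
        )


example = Postmortem(
    incident="Database connection pool exhaustion under high transactional traffic",
    root_cause="(placeholder: filled in by the resolving engineer)",
    successful_fix="raise proxy pool limits to 50, add transaction timeout safeguards",
    failed_attempts=["scaling pool replicas dynamically (triggered DB lock storms)"],
)
memory_text = example.to_memory_text()
```

&lt;p&gt;The resulting string could then be passed as the content of a retain call, in place of a free-text summary.&lt;/p&gt;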

&lt;p&gt;One thing I'd do differently is seed the knowledge base earlier. The system only becomes convincingly better than a generic LLM once there's enough incident history to surface precise matches. With 150 incidents loaded, the difference is stark. With 5, it's marginal. Data quality and quantity are part of the product.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;The current "Teach the System" input is free text. The obvious next step is a structured form — separate fields for root cause, fix steps, failed attempts, and the customer message that generated the least confusion. Structured inputs produce more consistent memories, which produce more reliable recall.&lt;/p&gt;

&lt;p&gt;The architecture also has room to expand beyond a single knowledge bank. Right now there's one organizational memory shared across all incidents. A multi-tenant version with per-team or per-service memory banks would let different engineering teams maintain separate incident histories while still being able to query across them when needed.&lt;/p&gt;
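
&lt;p&gt;As a rough sketch of how per-team banks might be selected, assuming each team maps to its own bank ID and a shared bank remains available for cross-team queries. The bank names, team names, and the &lt;code&gt;banks_to_query&lt;/code&gt; helper are all hypothetical; none of this exists in the current project.&lt;/p&gt;

```python
SHARED_BANK = "org-incidents"  # hypothetical shared bank ID
TEAM_BANKS = {  # hypothetical per-team bank IDs
    "payments": "payments-incidents",
    "platform": "platform-incidents",
}


def banks_to_query(team, include_shared=True):
    """Return the bank IDs to recall from, team-specific bank first."""
    banks = []
    if team in TEAM_BANKS:
        banks.append(TEAM_BANKS[team])
    if include_shared:
        banks.append(SHARED_BANK)
    return banks


payments_scope = banks_to_query("payments")
team_only_scope = banks_to_query("platform", include_shared=False)
```

&lt;p&gt;A recall wrapper could loop over the returned IDs and merge the results, keeping each team's history separate at write time while still allowing cross-team reads.&lt;/p&gt;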

&lt;p&gt;The memory layer works. What I keep thinking about is how much better it gets with every incident that runs through it — and how most engineering teams are sitting on years of incident history that a system like this could immediately put to use.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>python</category>
    </item>
  </channel>
</rss>
