<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: M.Shahmir Khan Afridi</title>
    <description>The latest articles on DEV Community by M.Shahmir Khan Afridi (@mshahmir_khanafridi_b91).</description>
    <link>https://dev.to/mshahmir_khanafridi_b91</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3806851%2F33949cb5-fef9-4079-83a1-688469899f1a.png</url>
      <title>DEV Community: M.Shahmir Khan Afridi</title>
      <link>https://dev.to/mshahmir_khanafridi_b91</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mshahmir_khanafridi_b91"/>
    <language>en</language>
    <item>
      <title>Agentic AI and Search Agents</title>
      <dc:creator>M.Shahmir Khan Afridi</dc:creator>
      <pubDate>Tue, 10 Mar 2026 16:36:39 +0000</pubDate>
      <link>https://dev.to/mshahmir_khanafridi_b91/agentic-ai-and-search-agents-2da4</link>
      <guid>https://dev.to/mshahmir_khanafridi_b91/agentic-ai-and-search-agents-2da4</guid>
      <description>&lt;p&gt;— okay this was actually kind of  intriguing &lt;/p&gt;

&lt;p&gt;so I have been putting this off for like three days. eventually sat down with both papers last night and I am going to be honest the first one (the MDPI one) took me forever to get through because the intro is just dense. like they define the same thing four different ways before actually moving on. but once I got past that it was fine. &lt;/p&gt;

&lt;p&gt;I used NotebookLM to generate a summary before reading, which I actually do for most heavy papers now. the summary was decent but I will talk about that more at the end because I actually have thoughts on it. &lt;/p&gt;

&lt;p&gt;anyway. agentic AI. &lt;/p&gt;




&lt;p&gt;the basic idea and why it's different from normal AI &lt;/p&gt;

&lt;p&gt;okay so in our AI class we have talked a lot about agents — how they perceive things, make decisions, take actions. the whole sense-think-act thing. what these papers are describing is basically that but taken way further than the textbook version. &lt;/p&gt;

&lt;p&gt;normal LLMs (like the ones we have discussed in class) are reactive. you put something in, something comes out, done. agentic AI is different because the system can actually pursue a goal over multiple steps without you holding its hand the whole time. it sets sub-goals on its own, picks tools to use, executes actions, checks the results, adjusts. the whole thing runs with minimal human involvement. &lt;/p&gt;

&lt;p&gt;which sounds simple when I write it like that but the actual implementation is way more complicated and that is kind of what both papers are about. &lt;/p&gt;

&lt;p&gt;the MDPI paper is more of a big picture review — definitions, frameworks, architectures, applications, challenges. the arXiv one (2508.05668) is more focused specifically on search agents and gets into training methodology and benchmarks. they overlap a lot but they are doing different things and I think my initial LLM summary kind of missed that distinction, which I will get to. &lt;/p&gt;




&lt;p&gt;architectures — the part I actually found interesting &lt;/p&gt;

&lt;p&gt;the MDPI paper goes through several architectural models and some of them connected really well to stuff we have covered. &lt;/p&gt;

&lt;p&gt;the ReAct model is basically a loop. reason, act, observe what happened, reason again. it's fast and works well for simpler tasks. reminded me of the basic agent cycle from class actually, like it's the most direct implementation of sense-think-act you can get. &lt;/p&gt;
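&lt;p&gt;just to pin the idea down for myself — here's a toy version of that loop in Python. the calculator "tool" and the decision rule are completely made up by me, not from either paper: &lt;/p&gt;

```python
# toy ReAct-style loop: reason about what to do, act, observe, repeat.
# the calculator "tool" and the stopping rule are my own stand-ins.

def calculator(expr):
    # stand-in tool: evaluate a small arithmetic expression
    return eval(expr)

def react_loop(goal, max_steps=5):
    observations = []
    for _ in range(max_steps):
        # "reason": with no observations yet, try the tool; otherwise stop
        if observations:
            return observations[-1]
        # "act" + "observe"
        observations.append(calculator(goal))
    return observations[-1]

print(react_loop("2 + 3 * 4"))  # prints 14
```

&lt;p&gt;obviously a real agent would have an LLM doing the "reason" step and a whole toolbox instead of one function, but the loop shape is the same. &lt;/p&gt;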

&lt;p&gt;then there is the Hierarchical / Supervisor model which is more interesting to me. you have a main agent at the top that breaks a problem into pieces and hands them off to specialized sub-agents below. so like if the task is "write a market research report" the supervisor does not do all of it — it delegates the web searching to one agent, the data analysis to another, the writing to another. this maps onto multi-agent systems which we touched on in class and makes a lot more sense to me now as an actual practical thing rather than just a concept. &lt;/p&gt;
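&lt;p&gt;here's the shape of that delegation as I understand it — the sub-agents are just functions in this sketch, and all the names are mine: &lt;/p&gt;

```python
# toy supervisor pattern: the top agent routes subtasks to specialized
# sub-agents and composes their outputs. all names are my invention.

def search_agent(topic):
    return "findings about " + topic

def analysis_agent(findings):
    return "analysis of " + findings

def writer_agent(analysis):
    return "report based on " + analysis

def supervisor(task):
    # the supervisor never does the work itself, it only delegates
    findings = search_agent(task)
    analysis = analysis_agent(findings)
    return writer_agent(analysis)

print(supervisor("market research"))
# prints: report based on analysis of findings about market research
```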

&lt;p&gt;the BDI architecture was the one I kept coming back to. Belief Desire Intention. we mentioned this in class briefly and I actually did not completely get it at the time but reading it in context helped. the point is that you can look at the agent's beliefs (what it thinks is true about the world), its desires (what it wants to achieve), and its intentions (what it's committed to doing right now) and actually trace why it made a decision. which is a big deal for accountability and transparency — two things we have talked about a lot in the AI ethics portions of the course. like yes the agent is autonomous but at least you can audit it. &lt;/p&gt;
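&lt;p&gt;to make the traceability point concrete, here's a minimal toy BDI state — my own sketch, with a made-up deliberation rule: &lt;/p&gt;

```python
# toy BDI agent: the decision is auditable because you can read off the
# beliefs, desires, and intention that produced it. my own toy example.
from dataclasses import dataclass, field

@dataclass
class BDIAgent:
    beliefs: dict = field(default_factory=dict)     # what it thinks is true
    desires: list = field(default_factory=list)     # what it wants to achieve
    intentions: list = field(default_factory=list)  # what it's committed to

    def deliberate(self):
        # commit to the first desire whose precondition is believed true
        for goal, precondition in self.desires:
            if self.beliefs.get(precondition):
                self.intentions.append(goal)
                return goal
        return None

agent = BDIAgent(beliefs={"library_open": True},
                 desires=[("go_to_gym", "gym_open"),
                          ("study_at_library", "library_open")])
print(agent.deliberate())  # study_at_library — traceable to a belief
```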

&lt;p&gt;there's also the Neuro-Symbolic design but I will be honest that section was harder for me to completely grasp. the basic idea is combining neural networks (good at perception and pattern matching, bad at being explainable) with symbolic logic (good at structured reasoning, traceable). the paper argues you need both. which connects to something we discussed earlier in the semester about whether deep learning alone is sufficient or whether you need symbolic components on top — and the answer these papers seem to give is basically "you need both and here is how to combine them." &lt;/p&gt;




&lt;p&gt;search agents specifically — this is where the second paper comes in &lt;/p&gt;

&lt;p&gt;okay so the arXiv paper is specifically about what they call "deep search agents" which are a specialized type of agentic system focused on information retrieval. and not just like, googling something — these agents control the entire retrieval process. web search, private databases, internal memory, all of it. they decide what to search for, read the results, decide what to search for next based on what they found, and keep going until they have actually answered the question properly. &lt;/p&gt;

&lt;p&gt;the paper breaks down search into three structures and this part connected really directly to information retrieval concepts from class: &lt;/p&gt;

&lt;p&gt;parallel search — you decompose the query into multiple sub-queries and run them all at the same time. good for breadth and efficiency. &lt;/p&gt;

&lt;p&gt;sequential / iterative search — you run a loop. search, read what you got, reflect, decide what to search next. the next search depends on what the previous one returned. this is actually how I research things when I actually care about the answer. &lt;/p&gt;

&lt;p&gt;hybrid tree or graph-based — this one is the most complex. the agent can explore multiple search paths, backtrack if something is not working, and revise its whole strategy mid-task. backtracking is something we have seen in AI search algorithms (like in the search and problem solving unit) and seeing it applied to information retrieval made it click for me as a natural extension of the same idea. &lt;/p&gt;
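&lt;p&gt;here's my own toy version of the sequential-plus-backtracking shape — the fake link "corpus" is invented, this is just the control flow: &lt;/p&gt;

```python
# toy search with backtracking: follow a lead, and if it dead-ends,
# back up and try the next one. the fake corpus is my invention.

CORPUS = {
    "agentic AI": ["robotics", "search agents"],  # robotics is a dead end
    "robotics": [],
    "search agents": ["RLVR"],
    "RLVR": ["ANSWER"],
}

def deep_search(query, path=None):
    path = list(path or []) + [query]
    for lead in CORPUS.get(query, []):
        if lead == "ANSWER":
            return path + ["ANSWER"]
        found = deep_search(lead, path)  # follow the lead deeper
        if found:
            return found
        # lead dead-ended: backtrack and try the next one
    return None

print(deep_search("agentic AI"))
# ['agentic AI', 'search agents', 'RLVR', 'ANSWER']  (robotics abandoned)
```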




&lt;p&gt;training and optimization — the technical stuff &lt;/p&gt;

&lt;p&gt;this is the section I had to read twice. the arXiv paper goes into methodology properly which I appreciated even though it was dense. &lt;/p&gt;

&lt;p&gt;the basic training pipeline is Supervised Fine-Tuning first — you show the model examples of good reasoning and search loops so it learns what "doing it right" looks like. then Reinforcement Learning on top of that, specifically something called RLVR (reinforcement learning with verifiable rewards) where the agent gets reward signals based on whether its outputs are actually correct and verifiable rather than just plausible-sounding. &lt;/p&gt;

&lt;p&gt;the reward functions are multi-objective which means they are balancing multiple things at once — answer correctness, how efficient the retrieval was, quality of evidence, and penalties for redundant searches or outputs that are longer than they need to be. that last one is interesting to me because it means the system is being trained to be concise which isn't something I expected. &lt;/p&gt;
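&lt;p&gt;the shape of that reward as I understand it — the weights and numbers here are mine, not the paper's: &lt;/p&gt;

```python
# toy multi-objective reward mirroring the shape described: reward
# correctness, penalize extra searches and padded answers. the weights
# are numbers I made up purely for illustration.

def reward(correct, num_searches, answer_tokens,
           w_search=0.05, w_length=0.001):
    r = 1.0 if correct else 0.0        # answer correctness dominates
    r -= w_search * num_searches       # penalty for redundant searches
    r -= w_length * answer_tokens      # penalty for unnecessary length
    return round(r, 4)

# a correct but rambling run scores below a correct, concise one
print(reward(True, num_searches=3, answer_tokens=400))  # 0.45
print(reward(True, num_searches=2, answer_tokens=80))   # 0.82
```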

&lt;p&gt;there's also this concept called scaling up test-time search — giving the model more computational resources at inference time (when it's actually being used) rather than just at training time. the argument is that more thinking time at the moment of use can improve reasoning quality. this feels counterintuitive coming from the standard "bigger training = better model" assumption but the paper makes a reasonable case for it. &lt;/p&gt;
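&lt;p&gt;a cheap way to see why it can work is best-of-n sampling with a verifier — everything in this sketch (the fake candidates, the scoring rule) is my own stand-in: &lt;/p&gt;

```python
# toy test-time scaling: sample n candidate "reasoning paths" and keep
# the best one under a verifier score. all stand-ins, my own example.
import random

def verifier(x):
    return -abs(x - 1.0)    # stand-in score: closer to 1.0 is better

def best_of_n(n, seed=0):
    rng = random.Random(seed)
    candidates = [rng.gauss(0, 1) for _ in range(n)]  # fake sampled answers
    return max(candidates, key=verifier)

# with a fixed seed, the 32-sample pool contains the 1-sample pool,
# so spending more inference-time compute can only match or beat less
print(verifier(best_of_n(32)) >= verifier(best_of_n(1)))  # True
```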

&lt;p&gt;benchmarks used include FRAMES, GAIA, and HotpotQA. accuracy in specialized settings reportedly goes above 94% which sounds great but the paper is pretty honest that this does not always transfer to messier real world environments. &lt;/p&gt;




&lt;p&gt;applications &lt;/p&gt;

&lt;p&gt;healthcare, finance, legal research, automated reporting — the applications section is broad and honestly I skimmed parts of it because it started to feel like a list. the one that stood out to me was Deep Research where an agent runs an extended autonomous research session across multiple sources and produces a full report at the end. I have actually used this feature in a couple of AI tools before and understanding the architecture underneath it now (the iterative search loops, the tool selection, the reward shaping during training) makes it feel less like a black box. &lt;/p&gt;

&lt;p&gt;there's also this thing where agents use search to improve their own internal capabilities — navigating memory, selecting the right tools, retrieving past experiences to reason better in the future. it's kind of recursive in a way that I am not sure I completely understand yet but the concept is interesting. &lt;/p&gt;




&lt;p&gt;challenges and limitations &lt;/p&gt;

&lt;p&gt;both papers devote decent space to what is still broken and I think this section is actually important to not skip. &lt;/p&gt;

&lt;p&gt;the brittleness problem — these agents perform well in controlled settings and then degrade when something unexpected happens. this is a real deployment problem. like you can not put a system in a hospital or a bank that works 94% of the time in the lab and then falls apart when the data looks slightly different from what it was trained on. &lt;/p&gt;

&lt;p&gt;accountability is still genuinely unresolved. if an autonomous agent makes a bad decision — wrong medical recommendation, fraudulent financial action, whatever — who's responsible? the developer? the company deploying it? the user who set the goal? the paper raises this and does not completely answer it because the field hasn't. this connects to the AI ethics stuff we have covered and is honestly one of those questions that makes me think the technical progress is ahead of the governance progress by a significant margin. &lt;/p&gt;

&lt;p&gt;security — adversarial attacks, data poisoning, unverifiable tool use. these are real vulnerabilities that get more serious as agents get more autonomous. not theoretical. &lt;/p&gt;




&lt;p&gt;NotebookLM experience — honest reflection &lt;/p&gt;

&lt;p&gt;okay so I loaded both papers into NotebookLM and had it generate a summary before I read them properly. and the summary was fine. accurate at a surface level, got the main concepts right, named the architectures. &lt;/p&gt;

&lt;p&gt;but here is the thing I actually noticed. the summary treated both papers as basically the same thing — like one unified document about agentic AI. and they are not. the MDPI paper is doing broad taxonomic review work, the arXiv paper is doing focused technical survey work on a specific subtype of system. that distinction matters for understanding what each paper is actually contributing. the LLM summary flattened it. &lt;/p&gt;

&lt;p&gt;the other thing is that summaries give everything roughly equal weight. the training methodology section in the arXiv paper — the RLVR approach, the test-time compute scaling, the multi-objective reward functions — got summarized in like two sentences. but that section is actually one of the more technically significant parts of the paper for understanding why these systems perform the way they do. you'd never know that from the summary. &lt;/p&gt;

&lt;p&gt;so my honest take is that using the LLM summary as a starting point was useful for orientation — I knew what topics were coming before I hit them in the actual text. but it was not a replacement for reading because it could not tell me which parts actually mattered versus which parts were background context. that judgment still required going back to the source. the map isn't the territory and all that. &lt;/p&gt;




&lt;p&gt;final thoughts &lt;/p&gt;

&lt;p&gt;I think the thing that stuck with me from both papers is that the gap between "AI that answers questions" and "AI that pursues goals" is bigger than it sounds. technically, conceptually, and in terms of what it means for accountability and safety. the course material gave me the vocabulary to actually engage with what these papers are describing — agent architectures, search structures, multi-agent collaboration, the ethics of autonomous systems — which made reading them feel less like decoding and more like connecting dots. &lt;/p&gt;

&lt;p&gt;still think the accountability question is going to be the hardest one to solve. the technical problems feel solvable with more research. the governance problems feel like they require something more than research. &lt;/p&gt;

&lt;p&gt;anyway that is probably enough. it's late. &lt;/p&gt;




&lt;p&gt;Mention: &lt;a class="mentioned-user" href="https://dev.to/raqeeb_26"&gt;@raqeeb_26&lt;/a&gt;  &lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>learning</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
