24P-0574 Taha Waseem

# Adaptive-weighted A* improves pathfinding with heuristic rewards; survey reviews LLM search agents, optimization, evaluation

## LLM-Based Deep Search Agents

The first paper is a systematic survey of Search Agents: autonomous LLM-based entities adept at planning, conducting multi-turn retrievals, and integrating information from multiple sources.

Search has shifted from traditional keyword matching to agentic search, in which models proactively steer the retrieval process through adaptive reasoning rather than following fixed rules.
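The adaptive loop described above can be sketched in a few lines. This is a minimal illustration, not the survey's architecture: `plan_next_query` and `retrieve` are trivial stand-ins for an LLM planner and a search backend.

```python
# Minimal sketch of an agentic search loop: plan -> retrieve -> integrate,
# repeated until the planner adaptively decides it has enough evidence.
# The planner and retriever below are illustrative stand-ins.

def plan_next_query(question, evidence):
    """Stand-in for an LLM planner: stop once any evidence is gathered."""
    if evidence:
        return None  # planner judges the evidence sufficient
    return question  # otherwise issue the question as the next query

def retrieve(query):
    """Stand-in for a retrieval backend (web search, database, etc.)."""
    corpus = {"capital of France": ["Paris is the capital of France."]}
    return corpus.get(query, [])

def agentic_search(question, max_turns=3):
    """Multi-turn loop: the stop condition is a model decision, not a fixed rule."""
    evidence = []
    for _ in range(max_turns):
        query = plan_next_query(question, evidence)
        if query is None:  # adaptive termination
            break
        evidence.extend(retrieve(query))
    return evidence
```

The key contrast with keyword search is that the loop's control flow (what to ask next, when to stop) is delegated to the model rather than hard-coded.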

- **Search Structures:** The authors classify agent search into parallel (concurrent sub-queries), sequential (decision-making informed by preceding steps), and hybrid (tree- or graph-based exploration) frameworks.

- **Optimization and Learning:** Agents are optimized with either tuning-free techniques (multi-agent workflows) or tuning-based techniques such as Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to internalize complex search paths.

- **Next Steps:** Open challenges include extending search to private or proprietary data, moving toward multimodal search (images and video), and enabling agents to genuinely self-evolve by refining their own search strategies.

## Improved A* Path-Planning Algorithm

The second paper optimizes the A* algorithm for autonomous vehicles, addressing long running times, redundant search nodes, and paths that collide with obstacle corners.

**Technical Innovations:**

- **Diagonal-free Five-way Search:** A new search strategy that filters out diagonal movements near obstacles to improve safety and avoid corner collisions.

- **Adaptive Dynamic Weighting:** Uses a radial basis function to adjust the heuristic function's weight based on the distance from the starting point, significantly reducing the number of nodes explored.

- **Heuristic Reward Values:** Adds target-point reward values to the cost function to help the algorithm escape local optima (getting stuck behind long obstacles).
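The three modifications can be sketched together in one grid-based A*. This is a minimal illustration under stated assumptions: the Gaussian RBF weight, the reward coefficient, and the corner-cutting filter are my assumed forms, not the paper's exact formulas.

```python
# Illustrative grid A* combining the paper's three ideas (assumed forms).
# Grid cells: 0 = free, 1 = obstacle.
import heapq
import math

def neighbors(grid, node):
    """8-connected moves, but diagonal steps that would clip an obstacle
    corner are filtered out (the 'diagonal-free' safety rule)."""
    r, c = node
    rows, cols = len(grid), len(grid[0])
    for dr, dc in [(-1,0),(1,0),(0,-1),(0,1),(-1,-1),(-1,1),(1,-1),(1,1)]:
        nr, nc = r + dr, c + dc
        if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc]:
            continue
        if dr and dc and (grid[r][nc] or grid[nr][c]):
            continue  # diagonal move would cut a blocked corner: skip it
        yield (nr, nc), math.hypot(dr, dc)

def adaptive_weight(d_start, sigma=5.0, k=1.0):
    """Assumed Gaussian RBF: weight decays from 1+k toward 1 as distance
    from the start grows, so the heuristic dominates early and fades later."""
    return 1.0 + k * math.exp(-(d_start ** 2) / (2 * sigma ** 2))

def astar(grid, start, goal):
    h = lambda n: math.hypot(n[0] - goal[0], n[1] - goal[1])
    open_heap = [(adaptive_weight(0.0) * h(start), 0.0, start)]
    came, g = {}, {start: 0.0}
    while open_heap:
        _, gc, node = heapq.heappop(open_heap)
        if node == goal:  # reconstruct the path back to the start
            path = [node]
            while node in came:
                node = came[node]
                path.append(node)
            return path[::-1]
        for nxt, step in neighbors(grid, node):
            ng = gc + step
            if ng < g.get(nxt, float("inf")):
                g[nxt], came[nxt] = ng, node
                d_start = math.hypot(nxt[0] - start[0], nxt[1] - start[1])
                # Assumed reward term: a small bonus for moves that shrink
                # the distance to the goal, nudging the search out of traps
                # behind long obstacles.
                reward = 0.3 * (h(node) - h(nxt))
                f = ng + adaptive_weight(d_start) * h(nxt) - reward
                heapq.heappush(open_heap, (f, ng, nxt))
    return None
```

Like any weighted A*, this trades strict optimality for fewer expansions; the RBF makes that trade-off vary along the route instead of using one fixed weight.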

**Performance Gains:** Experimental results showed the improved algorithm reduced search nodes by 76.4% and turn angles by 71.7% on average while shortening planning time.

**Refinement:** The paper introduces a secondary optimization to remove redundant path points and uses Bézier curves to smooth the final path for vehicle traversal.
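The smoothing step can be illustrated with a quadratic Bézier blend. This is a generic sketch of the technique, not the paper's exact construction: each sharp corner of the polyline is replaced by samples of a curve through its neighboring segment points.

```python
# Quadratic Bezier sampling: B(t) = (1-t)^2*p0 + 2(1-t)t*p1 + t^2*p2.
# p1 is the sharp corner being smoothed; p0 and p2 lie on the adjacent
# path segments. Returns a list of (x, y) samples from p0 to p2.
def quadratic_bezier(p0, p1, p2, steps=10):
    pts = []
    for i in range(steps + 1):
        t = i / steps
        x = (1 - t) ** 2 * p0[0] + 2 * (1 - t) * t * p1[0] + t ** 2 * p2[0]
        y = (1 - t) ** 2 * p0[1] + 2 * (1 - t) * t * p1[1] + t ** 2 * p2[1]
        pts.append((x, y))
    return pts
```

The curve starts at `p0`, ends at `p2`, and is pulled toward (but never through) the corner `p1`, which is what removes the abrupt turn angle for the vehicle.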

## Combined Insights

While the first paper deals with seeking information and the second with seeking a physical route, they share core themes in the evolution of search technology:

- **Dynamic Adaptation:** Both systems move away from static parameters: LLM agents use "Dynamic Planning," while the A* algorithm uses "Adaptive Weights" to change behavior based on current progress.

- **Search-Centric Scaling:** Both papers emphasize that increasing computation or search turns ("scaling up test-time search") leads to superior outcomes, whether a more accurate research report or a safer vehicle path.

- **Complexity Management:** Both address the need to reduce "noise": LLM agents must filter irrelevant web data, while the A* algorithm must eliminate redundant nodes and hazardous paths.
