Dhiraj Das

Posted on • Originally published at dhirajdas.dev

From Algorithms to Agents: How My Research in Clustering Shapes My Automation Logic

🎯 Key Insights

  • Research Shapes Thinking: Academic principles influence how I approach automation problems
  • Noise = Flakiness: The same mental model for filtering spatial noise applies to test stability
  • Efficiency Matters: Algorithmic thinking drives optimization in wait strategies and element selection
  • Pattern Recognition: The mindset of "finding order in chaos" applies to self-healing frameworks

*"Just click this, type that, check this."*

If this is how you think about test automation, you're building a house of cards.

Most automation engineers focus on actions—what to click, what to type, what to assert. But treating automation as just a sequence of actions leads to brittle scripts that shatter the moment the UI changes a single class name.

I don't look at automation as scripting. I look at it as a data problem.

Why? Because long before I was building automation frameworks in Python, my co-author Hrishav and I were researching algorithmic efficiency in spatial databases. That research—published in 2012—didn't teach me a specific technique to copy-paste into Selenium. But it fundamentally shaped *how I think* about complex data problems.

🔬 The Foundation: The TDCT Algorithm

Back in 2012, Hrishav and I co-authored a research paper titled *"A Density Based Clustering Technique For Large Spatial Data Using Polygon Approach"* (TDCT).

The Problem We Solved

How do you find meaningful patterns (clusters) in massive, chaotic datasets—without getting overwhelmed by noise?

Existing algorithms like DBSCAN were good, but they struggled with:

  • Arbitrary shapes: Real-world data doesn't form neat circles
  • Computational cost: Scanning every point against every other point doesn't scale
  • Noise sensitivity: Outliers distorted the cluster boundaries
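For context, here is what DBSCAN's circular-neighborhood behavior looks like with scikit-learn. This is purely illustrative (the data is synthetic); the `eps` parameter is the radius of the radial scan geometry that TDCT replaced:

import numpy as np
from sklearn.cluster import DBSCAN

# Two dense blobs plus scattered outliers
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.2, size=(100, 2))
blob_b = rng.normal(loc=(3.0, 3.0), scale=0.2, size=(100, 2))
outliers = rng.uniform(low=-2.0, high=5.0, size=(20, 2))
X = np.vstack([blob_a, blob_b, outliers])

# eps is the radius of the circular neighborhood around each point
labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)

print("clusters found:", len(set(labels) - {-1}))  # noise is labeled -1
print("points flagged as noise:", int((labels == -1).sum()))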

Our Solution: Triangular Density

Instead of using circular neighborhoods (like DBSCAN), we mapped data points into triangular polygons. This allowed us to:

  • Calculate density more efficiently than radial scans
  • Detect clusters of arbitrary, non-convex shapes
  • Isolate noise points without corrupting core clusters

Key Insight

By changing the *geometry* of the problem (circles → triangles), we fundamentally reduced computational complexity while *improving* cluster detection accuracy. This collaborative work laid the foundation for how I approach complex data problems to this day.
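The published algorithm is more involved, but a toy sketch conveys the geometric intuition. Here Delaunay triangulation stands in for our polygon construction (an assumption made for brevity, not the paper's method): small triangle areas signal dense regions, so points that only ever appear in large triangles are likely noise. The `area_threshold` value is illustrative, not a parameter from the paper:

import numpy as np
from scipy.spatial import Delaunay

def triangle_areas(points: np.ndarray, simplices: np.ndarray) -> np.ndarray:
    """Area of each triangle from its three vertices (shoelace formula)."""
    a = points[simplices[:, 0]]
    b = points[simplices[:, 1]]
    c = points[simplices[:, 2]]
    return 0.5 * np.abs(
        (b[:, 0] - a[:, 0]) * (c[:, 1] - a[:, 1])
        - (b[:, 1] - a[:, 1]) * (c[:, 0] - a[:, 0])
    )

def dense_core_points(points: np.ndarray, area_threshold: float) -> np.ndarray:
    """Mark points that belong to at least one small (dense) triangle."""
    tri = Delaunay(points)
    areas = triangle_areas(points, tri.simplices)
    dense = np.zeros(len(points), dtype=bool)
    for simplex in tri.simplices[areas < area_threshold]:
        dense[simplex] = True
    return dense

rng = np.random.default_rng(0)
cluster = rng.normal(loc=(0.0, 0.0), scale=0.1, size=(60, 2))
noise = rng.uniform(low=-3.0, high=3.0, size=(10, 2))
points = np.vstack([cluster, noise])

core = dense_core_points(points, area_threshold=0.01)
print(f"{core.sum()} core points, {(~core).sum()} likely noise")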

🌉 The Bridge: Why This Matters for Quality Engineering

*"Dhiraj, what does spatial clustering have to do with Selenium?"*

Not the code—the mindset.

| Research Mindset | Automation Application |
| --- | --- |
| Noise obscures real patterns | Flaky tests obscure real bugs |
| Brute-force scanning doesn't scale | Linear polling and hard sleeps don't scale (see the sketch below) |
| Geometry matters for efficiency | The structure of your framework determines its resilience |
| Identify stable cores vs. noise | Distinguish reliable element attributes from dynamic ones |
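To ground the second row: a hard sleep always pays the worst-case cost, while an explicit wait returns the moment the condition holds. A minimal sketch of the standard Selenium pattern (the "results" locator is hypothetical):

import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Brute force: always pays the full 10 seconds, even if the page is ready in 0.2
time.sleep(10)
results = driver.find_element(By.ID, "results")

# Algorithmic: returns as soon as the condition holds; 10s is only the ceiling
results = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.ID, "results"))
)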

It's About Problem Framing

When I encounter a complex automation challenge, I don't immediately think "what Selenium command do I need?" I think:

  • What's the data structure here? (The DOM is a tree, test results are time-series data)
  • What's the noise vs. the signal? (Which element attributes are stable? Which failures are true bugs?)
  • How can I reduce complexity? (Can I optimize the problem's "geometry" like TDCT did?)

This mental model—trained by years of algorithmic research—influences every framework decision I make.

💡 Applying the Mindset: Practical Examples

Example 1: Multi-Attribute Element Location with Fallback Logic

Brute-Force Approach (like naive spatial scanning—single point of failure):

# If ID changes, everything breaks
element = driver.find_element(By.ID, "checkout-btn-v3")

Algorithmic Approach (like TDCT's density-core identification—multiple data points):

from selenium.webdriver.common.by import By
from selenium.webdriver.remote.webelement import WebElement
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException, TimeoutException

def find_element_with_fallback(driver, strategies: list[tuple]) -> WebElement:
    """
    Instead of relying on one brittle locator, we analyze multiple
    'data points' (attributes) ordered by reliability/stability.
    Like TDCT identifies cluster cores by density, we identify
    elements by attribute stability.
    """
    for strategy, locator in strategies:
        try:
            element = WebDriverWait(driver, 2).until(
                EC.element_to_be_clickable((strategy, locator))
            )
            return element
        except TimeoutException:
            continue
    raise NoSuchElementException("All strategies exhausted")

# Define strategies ordered by stability (most stable first)
checkout_strategies = [
    (By.CSS_SELECTOR, "[data-testid='checkout']"),  # Most stable: test IDs
    (By.CSS_SELECTOR, "button[aria-label='Checkout']"),  # Accessibility attrs
    (By.XPATH, "//button[contains(text(), 'Checkout')]"),  # Text content
    (By.CSS_SELECTOR, ".cart-section button.primary"),  # Structural fallback
]

checkout_btn = find_element_with_fallback(driver, checkout_strategies)

This reflects the TDCT mindset: instead of relying on a single identifier (like a single spatial coordinate), we cluster multiple attributes by reliability and select the highest-confidence match. The first strategy is our "density core"—if it fails, we gracefully fall back to less stable but still valid "neighbors."

Example 2: Self-Healing Element Location

Static Approach (brittle, noise-sensitive):

driver.find_element(By.ID, "submit-btn-v3")  # Breaks when ID changes

Adaptive Approach (cluster-like resilience):

# When an element isn't found, analyze multiple attributes:
# - Text content (stable?)
# - Class names (which are consistent?)
# - Position relative to stable anchors
# Then select the "highest confidence" match

This isn't literally running TDCT. But the *thinking* is the same: instead of relying on a single brittle identifier, we analyze multiple "data points" (attributes) to find the most stable combination.
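A minimal sketch of that adaptive idea, assuming hypothetical attribute weights, a button-only candidate pool, and an illustrative confidence cutoff (a real self-healing engine would be far more thorough):

from selenium.webdriver.common.by import By

def confidence_score(candidate, expected_text: str, expected_classes: set[str]) -> float:
    """Score a candidate element by how many 'stable' attributes it matches."""
    score = 0.0
    if expected_text and expected_text in (candidate.text or ""):
        score += 0.6  # text content: weighted here as the most stable signal
    classes = set((candidate.get_attribute("class") or "").split())
    if expected_classes:
        score += 0.4 * len(classes & expected_classes) / len(expected_classes)
    return score

def self_heal(driver, expected_text: str, expected_classes: set[str]):
    """Return the highest-confidence candidate, or fail loudly below a threshold."""
    candidates = driver.find_elements(By.TAG_NAME, "button")
    scored = [(confidence_score(c, expected_text, expected_classes), c) for c in candidates]
    if not scored:
        raise LookupError("No candidates found on the page")
    best_score, best = max(scored, key=lambda pair: pair[0])
    if best_score < 0.5:  # illustrative cutoff
        raise LookupError(f"Best candidate scored only {best_score:.2f}")
    return best

# The old ID broke, so fall back to attribute analysis
submit_btn = self_heal(driver, expected_text="Submit", expected_classes={"primary"})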

🛠️ The Tools I Build Reflect This Philosophy

When I created packages like Lumos ShadowDOM or Visual Guard, I wasn't consciously implementing clustering algorithms. But the design decisions reflect the same principles:

  • Traversing Shadow DOM efficiently → Understanding the *structure* of the problem before brute-forcing
  • Visual regression with SSIM → Using mathematical models (not pixel-by-pixel noise) to find meaningful differences (see the sketch after this list)
  • Self-healing in my frameworks → Treating element attributes as "data points" with varying reliability
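As an example of the SSIM point above, here is a minimal comparison with scikit-image. The file paths and threshold are hypothetical, and this is not Visual Guard's actual implementation:

from skimage.io import imread
from skimage.metrics import structural_similarity

# Hypothetical screenshot paths; both images must share the same dimensions.
# as_gray=True yields float arrays in [0, 1], hence data_range=1.0 below.
baseline = imread("baseline.png", as_gray=True)
current = imread("current.png", as_gray=True)

# SSIM models luminance, contrast, and structure rather than raw pixel deltas,
# so one-pixel antialiasing shifts don't register as regressions
score, diff_map = structural_similarity(baseline, current, full=True, data_range=1.0)

if score < 0.98:  # illustrative threshold, tune per application
    print(f"Visual regression suspected (SSIM = {score:.4f})")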

The research doesn't give me copy-paste solutions. It gives me a lens for seeing automation as a data problem, not a scripting problem.

🎯 Conclusion: Automation Isn't Just Code—It's Logic

Whether it's the TDCT algorithm Hrishav and I published years ago or the automation tools and libraries I build today, the goal remains the same:

The Goal

Bringing order to chaos.

The DOM is chaotic. Test data is chaotic. UI changes are chaotic.

But with the right algorithmic mindset—trained by research in one domain—we can bring that discipline to another domain entirely.

Read the Original Research

📄 A Density Based Clustering Technique For Large Spatial Data Using Polygon Approach (TDCT) — Published on ResearchGate

The Takeaway

*The best automation engineers aren't just coders. They're problem solvers who see data structures where others see buttons.*
