DEV Community

Santiago Palma
Santiago Palma

Posted on

How I Built a Graph-Based Team Formation System That Detects Organizational Linchpins

A deep dive into using Neo4j, Beam Search, and Betweenness Centrality for intelligent team assembly


The Problem: The Bus Factor Crisis

Every software team lives with an invisible risk: the Bus Factor—the minimum number of people who, if they left tomorrow, would bring your project to its knees.

The research is sobering:

  • 50% of open-source projects have Bus Factor ≤ 2 (Avelino et al., 2016)
  • Developer turnover increases defects by 40-60% (Foucault et al., 2015)

Traditional HR systems track skills but miss structural dependencies. Someone might be a "communication bridge" between the frontend and infrastructure teams without appearing in any report. When they leave, two teams that used to collaborate seamlessly suddenly can't talk to each other.

I built SmartChimera to solve this: a graph-based system that detects organizational linchpins and forms resilient teams. Here's how.


Architecture Overview

SmartChimera is a full-stack application built with Neo4j, FastAPI, and React. The architecture follows a modular design where each component has a single responsibility:

The system runs as containerized services with Docker Compose:

# docker-compose.yml
services:
  neo4j:
    image: neo4j:5.14.0
    ports:
      - "7474:7474"  # Browser UI
      - "7687:7687"  # Bolt protocol

  backend:
    build: ./backend
    depends_on:
      neo4j:
        condition: service_healthy
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000

  frontend:
    build:
      context: ./frontend
      dockerfile: Dockerfile.dev
    ports:
      - "5173:5173"
Enter fullscreen mode Exit fullscreen mode

Challenge 1: Detecting Organizational Linchpins

The Problem

Traditional HR systems track skills but miss structural dependencies. Someone might be a "communication bridge" between teams without appearing on any org chart. When they leave, two teams that used to collaborate seamlessly suddenly can't.

The Solution: Hybrid Risk Metric

We combine two complementary signals:

  1. Betweenness Centrality (BC) — Network topology: who bridges teams
  2. Project Weight (PW) — Workload concentration: who's overloaded
# linchpin_detector.py - REAL CODE
def compute_combined_risk_score(self) -> Dict[str, float]:
    """
    Risk(v) = α · BC_normalized(v) + β · PW(v)/max(PW)
    α = β = 0.5 for balanced detection
    """
    network_bc = self.compute_betweenness_centrality()  # Brandes via NetworkX
    project_scores = self._compute_project_dependency_score()

    final_scores = {}
    all_ids = set(network_bc.keys()) | set(project_scores.keys())

    for eid in all_ids:
        net_score = network_bc.get(eid, 0.0)
        proj_score = project_scores.get(eid, 0.0)
        # Weighted unification: 50% Network, 50% Project Weight
        final_scores[eid] = (net_score * 0.5) + (proj_score * 0.5)

    return final_scores
Enter fullscreen mode Exit fullscreen mode

The Betweenness Centrality computation uses the Brandes algorithm via NetworkX, which runs in O(VE) time—significantly better than the naive O(V³) approach:

def compute_betweenness_centrality(self) -> Dict[str, float]:
    """Compute Brandes BC combined with synthetic BC scores."""
    import networkx as nx
    G = nx.Graph()

    with self.driver.session() as s:
        for r in s.run("MATCH (e:Empleado) RETURN e.id as id"):
            G.add_node(r['id'])
        for r in s.run("MATCH (a:Empleado)-[:TRABAJO_CON]-(b:Empleado) "
                       "WHERE a.id < b.id RETURN a.id as s, b.id as d"):
            G.add_edge(r['s'], r['d'])

    brandes_bc = nx.betweenness_centrality(G, normalized=True)
    # Combine with pre-computed synthetic BC (use max of both)
    ...
    return self._bc_cache
Enter fullscreen mode Exit fullscreen mode

Why This Matters

This approach catches TWO types of linchpins:

  • Social Hubs: High BC, low projects — communication bottlenecks
  • Workhorses: Low BC, high projects — overloaded specialists

Both are organizational risks, but they require different mitigation strategies.


Challenge 2: Beam Search with Multi-Objective Optimization

The Problem

Forming an optimal team from N candidates is NP-hard. If you have 100 candidates and need a team of 5, that's C(100,5) = 75+ million combinations to evaluate. Exhaustive search simply doesn't scale.

The Solution: Beam Search

We maintain the top-W partial solutions at each step, pruning aggressively. This gives us O(k × n × W) complexity instead of exponential—polynomial time for a real-world approximation.

# smart_team_formation.py - REAL CODE
BEAM_WIDTH = 10

# State: (team_list, team_ids, covered_skills, score)
beam = [([], set(), set(), 0.0)]

for step in range(k):  # k = team size
    candidates_pool = []

    for (curr_team, curr_ids, curr_covered, curr_score) in beam:
        for candidate in candidates:
            if candidate['id'] in curr_ids:
                continue

            c_skills = get_skills(candidate)

            # Multi-objective scoring
            coverage_score = len(new_skills) * weights['skill_coverage']
            depth_score = get_depth(candidate, required) * weights['skill_depth']
            collab_score = get_collab_edges(driver, candidate['id'], curr_ids) * weights['collaboration']
            redundancy_score = len(overlap_skills) * weights['redundancy']
            bc_penalty = bc_score * weights['bc_penalty']  # Penalize linchpins!

            total = coverage_score + depth_score + collab_score + redundancy_score - bc_penalty
            candidates_pool.append((new_team, new_ids, new_covered, total))

    # Prune to top-W
    candidates_pool.sort(key=lambda x: x[3], reverse=True)
    beam = candidates_pool[:BEAM_WIDTH]
Enter fullscreen mode Exit fullscreen mode

The Result

Our benchmarks show:

  • Beam width 10 achieves ~98% of optimal quality with significant speedup
  • Response time: <500ms per recommendation on 150-node graphs
  • Memory: Constant O(W × k) space regardless of candidate pool size

Challenge 3: Context-Aware Formation with Mission Profiles

The Problem

A team for "legacy maintenance" needs completely different traits than one for "R&D innovation". One-size-fits-all doesn't work in the real world.

The Solution: 9 Configurable Mission Profiles

Each profile adjusts the weight coefficients in our multi-objective function:

# mission_profiles.py - REAL CODE
MISSION_PROFILES = {
    'mantenimiento': {
        'name': 'Mantenimiento Crítico (Resilient)',
        'description': 'Maximum stability. Penalizes risk and demands redundancy.',
        'weights': {
            'skill_coverage': 2.0,
            'skill_depth': 1.0,      # Stability > Brilliance
            'collaboration': 2.0,
            'redundancy': 5.0,       # CRITICAL: Must have backups
            'bc_penalty': 20.0       # VETO: No Linchpins allowed
        }
    },
    'innovacion': {
        'name': 'I+D / Deep Tech (Growth)',
        'description': 'Prioritize technical geniuses. Accept Bus Factor risk.',
        'weights': {
            'skill_coverage': 1.0,
            'skill_depth': 10.0,     # CRITICAL: Only experts
            'collaboration': 0.5,
            'redundancy': 0.0,
            'bc_penalty': -5.0       # BONUS: We WANT linchpins (they're the experts!)
        }
    },
    'entrega_rapida': {
        'name': 'Speed Squad (Agile)',
        'description': 'Teams that already know each other. Maximize prior collaboration.',
        'weights': {
            'skill_coverage': 2.0,
            'collaboration': 10.0,   # CRITICAL: Must have worked together
            'availability': 4.0,     # Must be free NOW
            'bc_penalty': 0.0
        }
    },
    # ... 6 more profiles including:
    # - legacy_rescue (SRE mode)
    # - junior_training (Skill development)
    # - crisis_response (Firefighting)
    # - architecture_review (Seek linchpins for strategic decisions)
    # - security_audit (Maximum paranoia and redundancy)
    # - cloud_migration (Broad technology coverage)
}
Enter fullscreen mode Exit fullscreen mode

Why This Matters

The same algorithm produces completely different teams based on strategic context:

  • Maintenance mode: Stable, redundant, no single points of failure
  • Innovation mode: Expert-heavy, accepts risk for maximum capability
  • Speed Squad: Prioritizes teams with prior collaboration history

Notice how the bc_penalty weight can be negative: for architecture reviews, we deliberately seek linchpins because they hold institutional knowledge. Sometimes you want your best people on the job, risk be damned.


Lessons Learned

1. Validate Your Metrics Rigorously

Our initial Bus Factor metric was calculated as 1 - avg_BC. When the algorithm itself minimizes BC, this became a tautology: we were measuring success by the very thing we optimized!

We caught this through rigorous statistical validation with N=500 Monte Carlo simulations. The lesson: always have independent validation metrics that measure outcomes your algorithm doesn't directly optimize.

2. Graph Databases Enable New Questions

With Neo4j, queries like "who bridges the frontend and backend teams?" become a single Cypher query:

MATCH (a:Empleado)-[:PERTENECE]->(t1:Team {name: 'frontend'})
MATCH (b:Empleado)-[:PERTENECE]->(t2:Team {name: 'backend'})
MATCH path = (a)-[:TRABAJO_CON*..3]-(b)
WITH nodes(path) as bridge_candidates
UNWIND bridge_candidates as person
RETURN person.nombre, count(*) as bridge_frequency
ORDER BY bridge_frequency DESC
Enter fullscreen mode Exit fullscreen mode

Try doing that with join-heavy SQL. Graph structures make relationship-centric queries trivial.

3. Heuristics Beat ML for Transparency

We could have trained a neural network to predict "good teams." But HR decisions require explainability. Managers need to understand why a team was recommended.

Beam Search with explicit weights gives us full auditability: "This candidate was selected because they add 2 new skills, have worked with 3 team members before, and have low Bus Factor risk." That's a conversation you can have with a VP. A neural network output isn't.


Results

Metric Value
Dataset 150-node organizational graph
Response time <500ms per recommendation
Validation N=500 Monte Carlo simulations
Algorithm Beam Search O(k × n × W)
Mission Profiles 9 configurable strategies
Stack Neo4j + FastAPI + React + TypeScript
Recognition 🏆 2nd Place - UNSA Engineering Project Fair 2025

What's Next?

SmartChimera is available on GitHub. Future plans include:

  • Reinforcement Learning for automatic weight optimization based on team outcomes
  • Temporal graphs to model evolving collaboration patterns over time
  • Privacy-preserving federated analysis for multi-organization deployment

If you're building HR tech with graph algorithms, working on organizational analytics, or just interested in applied graph theory—let's connect!


Santiago Palma — Universidad Nacional de San Agustín, Perú

Tags: #graphs #neo4j #python #fastapi #react #algorithms #opensource #hrtech


📚 Research References:

  • Avelino, G. et al. (2016). "A Novel Approach for Estimating Truck Factors" - ICPC
  • Foucault, M. et al. (2015). "Impact of Developer Turnover on Quality" - FSE
  • Brandes, U. (2001). "A Faster Algorithm for Betweenness Centrality" - Journal of Mathematical Sociology

Top comments (0)