Piyush Yenorkar

Posted on Jul 1

How We Built an AI That Remembers Your Team's Failures — Before They Happen Again

#ai #webdev #hackathon #productivity

Built at HackHazards '26 · FlowMind

The Problem Nobody Talks About

Every semester, student teams repeat the same mistakes.

Wrong person assigned to the wrong task. Decisions made in week one forgotten by week three. One person carrying 80% of the work while nobody notices until submission day.

We tried Trello. We tried Notion. We tried Jira.

They all had the same fundamental flaw — they record what happened. None of them learn from it.

So we built FlowMind.

What FlowMind Actually Is

FlowMind is an AI-powered group project manager built on persistent memory. The one-line definition we kept coming back to:

"FlowMind is an AI project manager that learns your team and predicts failures before they happen."

Every other tool starts fresh every project. FlowMind's memory compounds. The longer your team uses it, the smarter it gets.

The Tech Stack

Layer	Technology
Frontend	React 18 + Vite
Backend	Node.js + Express
AI / LLM	Groq API (llama3-70b-8192)
Persistent Memory	Hindsight by Vectorize
Knowledge Graph	Neo4j
Voice Transcription	Web Speech API
Deployment	Vercel + Render

How We Used Each Partner Track

Hindsight by Vectorize — The Memory Layer

Hindsight gives AI agents three core primitives: retain(), recall(), and reflect(). Every intelligent feature in FlowMind maps to one of these.

retain() — Every task completion, decision, and meeting gets stored:

await hindsight.retain({
  key: `task_${task.id}`,
  value: {
    assignedTo: task.assignedTo,
    estimatedHours: task.estimatedHours,
    actualHours: task.actualHours,
    completedOnTime: task.completedOnTime,
    taskType: task.taskType,
  },
  tags: ['task', 'performance', task.assignedTo]
})

reflect() — Generating pre-failure warnings from memory patterns:

const warning = await hindsight.reflect({
  query: `Flag delay risks based on this team's past patterns`,
  memoryBank: `team_${team.id}`,
  includeObservations: true
})
// Returns: "72% delay risk — backend tasks took 2x estimate in past cycles"

The most powerful part was Observation Consolidation: Hindsight automatically synthesizes raw retained facts into behavioral insights. We had planned 200 lines of custom pattern-detection logic. Once we found auto-consolidation, we dropped all of it. The system learned our team's patterns on its own.
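To make "pattern detection" concrete, here is a minimal sketch of the kind of hand-written aggregation that consolidation makes unnecessary. It is not how Hindsight works internally. The record fields mirror the value passed to retain() earlier, and the function name estimateBiasByType is hypothetical.

```javascript
// Hypothetical helper: ratio of total actual hours to total estimated hours,
// grouped by task type. A ratio of 2 means work took twice the estimate.
function estimateBiasByType(records) {
  const totals = new Map();
  for (const r of records) {
    // Skip records that cannot produce a meaningful ratio.
    if (!(r.estimatedHours > 0) || !(r.actualHours >= 0)) continue;
    const t = totals.get(r.taskType) ?? { estimated: 0, actual: 0, count: 0 };
    t.estimated += r.estimatedHours;
    t.actual += r.actualHours;
    t.count += 1;
    totals.set(r.taskType, t);
  }
  const result = {};
  for (const [type, t] of totals) {
    result[type] = { ratio: t.actual / t.estimated, samples: t.count };
  }
  return result;
}

// Illustrative data only, not real FlowMind records.
const sampleRecords = [
  { taskType: "backend", estimatedHours: 4, actualHours: 8 },
  { taskType: "backend", estimatedHours: 6, actualHours: 12 },
  { taskType: "frontend", estimatedHours: 5, actualHours: 5 },
];
const bias = estimateBiasByType(sampleRecords);
console.log(bias.backend.ratio); // 2 (20 actual hours over 10 estimated)
```

Even this small version needs rules for missing data and grouping, which is where hand-rolled pattern logic tends to grow.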

Neo4j — The Knowledge Graph

We used Neo4j to model team relationships as a graph. Members, skills, tasks, and outcomes all exist as nodes connected by edges.

// Create a member-skill relationship.
// MERGE instead of CREATE so re-running does not duplicate the edge.
MATCH (m:Member {name: "Piyush"})
MATCH (s:Skill {name: "React"})
MERGE (m)-[r:HAS_SKILL]->(s)
SET r.level = "advanced"

// Find the best person for a frontend task.
// score = distinct frontend tasks the member completed on time
MATCH (m:Member)-[:HAS_SKILL]->(s:Skill)
WHERE s.name IN ["React", "CSS", "JavaScript"]
WITH DISTINCT m
MATCH (m)-[:COMPLETED]->(t:Task {type: "frontend", onTime: true})
RETURN m.name AS member, count(DISTINCT t) AS score
ORDER BY score DESC

This is what makes FlowMind's task assignment genuinely intelligent. When the AI assigns a task to someone, it's not guessing — it's traversing a graph of real team relationships and past performance data.

Before Neo4j, assignment reasons were generic. After Neo4j:
"Assigned to Piyush — 6 frontend tasks completed, 90% on-time rate, React and Vite in skill graph."

That's the difference between a chatbot and an actual AI project manager.
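One way to keep such a reason grounded in data is to build the sentence from query results rather than asking the model to invent it. The sketch below assumes per-member stats have already been fetched from the graph. The function name and the input shape are hypothetical, not FlowMind's actual code.

```javascript
// Hypothetical formatter: turns per-member stats (as they might come back
// from a graph query) into a human-readable assignment reason.
function formatAssignmentReason({ name, taskType, completed, onTime, skills }) {
  const parts = [`${completed} ${taskType} tasks completed`];
  if (completed > 0) {
    // Whole-number percentage of completed tasks that were on time.
    const rate = Math.round((onTime * 100) / completed);
    parts.push(`${rate}% on-time rate`);
  }
  if (skills.length > 0) {
    parts.push(`${skills.join(", ")} in skill graph`);
  }
  return `Assigned to ${name}: ${parts.join("; ")}.`;
}

// Illustrative numbers only.
const exampleReason = formatAssignmentReason({
  name: "Piyush",
  taskType: "frontend",
  completed: 10,
  onTime: 9,
  skills: ["React", "Vite"],
});
console.log(exampleReason);
// Assigned to Piyush: 10 frontend tasks completed; 90% on-time rate; React, Vite in skill graph.
```

Because every clause comes from a counted value, the reason can be audited against the graph.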

Groq — The AI Brain

We used Groq's llama3-70b-8192 for three things:

Meeting Analysis: After a voice meeting ends, the full transcript, member skill profiles, and Neo4j team graph are sent to Groq. It returns structured JSON with extracted tasks, assignments with reasoning, decisions, and follow-up items. Response time: under 3 seconds.
AI Chat: Full conversational access to team memory. When a leader asks "why did the last sprint fail?", Groq reads the Hindsight memory context and answers from real team history.
Conflict Predictor: Groq analyzes Hindsight observation consolidations and Neo4j patterns together to generate a percentage delay risk per task before deadlines slip.

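// Call Groq's OpenAI-compatible chat completions endpoint
const response = await fetch(
  "https://api.groq.com/openai/v1/chat/completions",
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.GROQ_API_KEY}`,
      // Tells the API that the request body is JSON
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      model: "llama3-70b-8192",
      messages: [
        { role: "system", content: systemPrompt },
        { role: "user", content: userPrompt }
      ],
      // Low temperature for more consistent structured output
      temperature: 0.3
    })
  }
)
if (!response.ok) throw new Error(`Groq request failed: ${response.status}`)

// The reply text lives in the first choice's message
const { choices } = await response.json()
const content = choices[0].message.content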
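Since the meeting analysis relies on structured JSON coming back, the reply text (choices[0].message.content in the OpenAI-compatible response format) is worth validating before tasks reach the board. The sketch below is one way to do that. The field names tasks, decisions, followUps, title, and assignedTo are assumptions for illustration, not FlowMind's actual schema.

```javascript
// Markdown code-fence marker, built at runtime to keep this snippet simple.
const FENCE = "`".repeat(3);
const LEADING_FENCE = new RegExp("^" + FENCE + "(?:json)?\\s*", "i");
const TRAILING_FENCE = new RegExp("\\s*" + FENCE + "$");

// Hypothetical validator for the model's meeting-analysis reply.
function parseMeetingAnalysis(content) {
  // Models sometimes wrap JSON in a Markdown code fence; strip it if present.
  const cleaned = content.trim().replace(LEADING_FENCE, "").replace(TRAILING_FENCE, "");
  let data;
  try {
    data = JSON.parse(cleaned);
  } catch (err) {
    return { ok: false, error: `Invalid JSON: ${err.message}` };
  }
  if (data === null || typeof data !== "object" || !Array.isArray(data.tasks)) {
    return { ok: false, error: "Expected an object with a tasks array" };
  }
  // Keep only tasks that carry the fields the board needs.
  const tasks = data.tasks.filter(
    (t) => t && typeof t.title === "string" && typeof t.assignedTo === "string"
  );
  return {
    ok: true,
    tasks,
    decisions: Array.isArray(data.decisions) ? data.decisions : [],
    followUps: Array.isArray(data.followUps) ? data.followUps : [],
  };
}

// Illustrative reply text, not real model output.
const sampleReply = '{"tasks":[{"title":"Build login page","assignedTo":"Piyush"}],"decisions":["Use Vite"]}';
const analysis = parseMeetingAnalysis(sampleReply);
console.log(analysis.ok, analysis.tasks.length); // true 1
```

Returning an { ok, error } result instead of throwing lets the UI fall back to manual task entry when the reply is malformed.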

The Core Features We Built

🎙️ AI Voice Meetings
When the leader starts a meeting, FlowMind AI joins as a participant and transcribes speech in real time via the Web Speech API. When the meeting ends, Groq analyzes the transcript, extracts tasks, and assigns them by skill profile. Tasks appear on the board automatically.

⚠️ Conflict Predictor
Reads Hindsight memory patterns and Neo4j relationship graph to flag delay risks per task before deadlines break. Not generic rules — actual team history powering the prediction.

🧠 Decision Log
Every team decision stored with full context in Hindsight memory. AI recalls these to prevent repeated arguments in future meetings.

💬 AI Chat
Full conversational access to complete team memory. Answers "what did we decide about X?" instantly from Hindsight recall.

📊 AI Insights
Surfaces behavioral patterns: peak productivity windows, recurring bottlenecks, estimation biases — all derived from Hindsight observation consolidation and Neo4j graph traversal.

👤 Member Skill Profiles
Members build profiles with skills, past experience, availability, and preferred task types. These get stored in both Hindsight memory and Neo4j. AI cross-references profiles against team outcomes to make assignments smarter over time.
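For the Conflict Predictor above, the percentage is produced by Groq reading Hindsight and Neo4j context. As a rough intuition for what a "delay risk per task" means, here is a deliberately simple baseline that counts late tasks in past history. The function name, the minSamples threshold, and the record shape are assumptions for illustration, not FlowMind's actual scoring.

```javascript
// Hypothetical baseline: share of an assignee's past tasks of the same type
// that finished late, expressed as a whole-number percentage.
function delayRiskPercent(history, task, minSamples = 3) {
  const similar = history.filter(
    (h) => h.assignedTo === task.assignedTo && h.taskType === task.taskType
  );
  // With too little history the number would be noise, so report "unknown".
  if (similar.length < minSamples) return null;
  const late = similar.filter((h) => h.completedOnTime === false).length;
  return Math.round((late * 100) / similar.length);
}

// Illustrative history only.
const pastTasks = [
  { assignedTo: "Piyush", taskType: "backend", completedOnTime: false },
  { assignedTo: "Piyush", taskType: "backend", completedOnTime: false },
  { assignedTo: "Piyush", taskType: "backend", completedOnTime: true },
  { assignedTo: "Piyush", taskType: "backend", completedOnTime: false },
  { assignedTo: "Debashree", taskType: "research", completedOnTime: true },
];
const risk = delayRiskPercent(pastTasks, { assignedTo: "Piyush", taskType: "backend" });
console.log(risk); // 75 (3 of 4 similar tasks were late)
```

Returning null for thin history keeps the UI from showing a confident number backed by one or two data points.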

The Implementation Journey

What We Got Right

Multi-bank memory architecture was the right call.
Separating per-user memory from shared team memory in Hindsight let the AI cross-reference individual skill profiles against team-wide outcomes. That cross-reference is what makes assignment reasoning genuinely useful rather than generic.

Groq's speed changed what was possible.
At under 3 seconds for full meeting analysis, the feature feels instant. If we had used a slower model, the voice meeting → task extraction flow would have felt broken. Speed is a UX feature.

Neo4j + Hindsight together is more powerful than either alone.
Hindsight stores temporal patterns — what happened over time. Neo4j stores relationship patterns — who is connected to what. When both feed into Groq's context, the AI has both dimensions. That combination is what makes FlowMind's predictions feel eerily accurate.

What We Got Wrong

Cold start is a real problem.
FlowMind is dramatically better after 2-3 projects. Week one predictions are weak because there's nothing in memory yet. We should have built a smarter onboarding flow that pre-seeds memory with team information before the first project starts.
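A pre-seeding step could be as small as turning each onboarding profile into memory payloads before the first meeting. The sketch below only builds the payloads, shaped like the retain() calls shown earlier (key, value, tags). The profile fields id, name, hoursPerWeek, and skills are hypothetical, and nothing here calls Hindsight.

```javascript
// Hypothetical onboarding helper: converts a member profile into payloads
// that could be passed to retain() one by one.
function buildSeedMemories(profile) {
  const base = `profile_${profile.id}`;
  const memories = [
    {
      key: `${base}_availability`,
      value: { member: profile.name, hoursPerWeek: profile.hoursPerWeek },
      tags: ["profile", "availability", profile.name],
    },
  ];
  for (const skill of profile.skills) {
    memories.push({
      key: `${base}_skill_${skill.name.toLowerCase()}`,
      value: { member: profile.name, skill: skill.name, level: skill.level },
      tags: ["profile", "skill", profile.name],
    });
  }
  return memories;
}

// Illustrative profile only.
const seedMemories = buildSeedMemories({
  id: "u1",
  name: "Debashree",
  hoursPerWeek: 8,
  skills: [
    { name: "Research", level: "advanced" },
    { name: "Documentation", level: "intermediate" },
  ],
});
console.log(seedMemories.length); // 3 (one availability record plus two skills)
```

Seeded records like these would give week-one assignments something to reason over, though they describe stated skills rather than observed outcomes.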

Web Speech API is inconsistent.
Chrome works reliably. Safari doesn't. Firefox ships speech synthesis but not speech recognition by default. We ended up using manual transcript input as a fallback more than we expected. The lesson: always build the fallback first, and make it feel intentional rather than like a workaround.
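A "fallback first" setup can start with feature detection, so manual input is a normal mode rather than an error path. The sketch below uses the standard SpeechRecognition interface, which some browsers expose only under the webkit prefix. The function names and the "live"/"manual" mode labels are hypothetical.

```javascript
// Returns the SpeechRecognition constructor if the environment exposes one,
// including the webkit-prefixed name, otherwise null.
function getSpeechRecognition(globalObj) {
  return globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
}

// Decide the transcript source up front so the manual path is a
// first-class mode instead of an error handler.
function chooseTranscriptMode(globalObj) {
  return getSpeechRecognition(globalObj) ? "live" : "manual";
}

// Starts live transcription and hands finalized phrases to onText.
// Any recognition error switches to the fallback (a simplification:
// a real app might retry on transient errors first).
function startLiveTranscript(Recognition, onText, onFallback) {
  const rec = new Recognition();
  rec.continuous = true; // keep listening across pauses
  rec.interimResults = false; // only deliver finalized phrases
  rec.onresult = (event) => {
    for (let i = event.resultIndex; i < event.results.length; i++) {
      if (event.results[i].isFinal) onText(event.results[i][0].transcript);
    }
  };
  rec.onerror = () => onFallback();
  rec.start();
  return rec;
}

// In a browser you would pass `window`. In Node no recognizer exists,
// so the manual mode is selected.
console.log(chooseTranscriptMode(globalThis));
```

Passing the global object in as a parameter keeps the detection logic testable outside a browser.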

We underestimated the schema design for Neo4j.
Getting the node and relationship types right for the knowledge graph took longer than building any feature. The graph schema is the hardest part of the Neo4j integration — plan it on paper before writing a single Cypher query.
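One lightweight way to "plan it on paper" is to write the schema down as data and check every edge against it before it reaches the database. The labels Member, Skill, and Task and the relationship types HAS_SKILL and COMPLETED come from the queries above. The validator itself and its names are a hypothetical sketch.

```javascript
// Schema as data: each relationship type lists the node labels it connects.
const GRAPH_SCHEMA = new Map([
  ["HAS_SKILL", { from: "Member", to: "Skill" }],
  ["COMPLETED", { from: "Member", to: "Task" }],
]);

// Returns null when the edge matches the schema, otherwise an error message.
function validateEdge(edge) {
  const rule = GRAPH_SCHEMA.get(edge.type);
  if (!rule) return `Unknown relationship type: ${edge.type}`;
  if (edge.from !== rule.from || edge.to !== rule.to) {
    return `${edge.type} must connect ${rule.from} to ${rule.to}`;
  }
  return null;
}

console.log(validateEdge({ type: "HAS_SKILL", from: "Member", to: "Skill" })); // null
console.log(validateEdge({ type: "COMPLETED", from: "Skill", to: "Task" }));
// COMPLETED must connect Member to Task
```

Adding a new relationship then means adding one entry to the map, which forces the schema decision to happen before the Cypher is written.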

The Moment That Made It Real

We were testing the voice meeting feature. Pasted in a transcript from a real college project meeting we'd had the week before.

3 seconds later, FlowMind said:

"Assigned to Piyush — React listed in skill profile, 3 similar frontend tasks completed on time, prefers frontend work. Assigned to Debashree — research and documentation in past work experience, preferred task types include research."

It read the room. From a transcript. Using memory it had built over previous sessions.

That's when we stopped thinking of FlowMind as a hackathon project and started thinking of it as something real.

What's Next

Sarvam AI integration for multilingual meeting transcription — Indian teams don't always speak English in meetings
Expo mobile app for members — phone notification when a meeting assigns you a task
Automated sprint retrospectives — AI generates a weekly team performance report from Hindsight memory

Try FlowMind

🔗 Live Demo: https://flowwithmind.vercel.app/
📁 GitHub: https://github.com/piyushyenorkar/FlowMind

Built at HackHazards '26 by Piyush Yenorkar and Debashree Mal
Theme: Human Experience & Productivity
Tracks: Neo4j · Render · Base44 · Sarvam · Expo

Top comments (1)

Luis Cruz • Jul 1

This post is basically a practical implementation of something a lot of AI systems claim but rarely do well: turning “memory” into operational decision-making for teams.

The core idea is an AI that doesn’t just store incidents or logs, but actively learns from team failures and reuses that history to prevent repetition. That aligns with a growing pattern in agent systems where memory is treated as structured “experience” rather than passive retrieval storage. Similar systems use retain/recall loops or persistent event stores to build institutional knowledge over time.

What makes this interesting is the shift in abstraction:

Instead of:

“What happened in the past?”

It becomes:

“What failure patterns repeat across this team?”
“What conditions predict a likely mistake?”
“What should the system block or warn about before execution?”

That moves the AI from observational memory → preventive intelligence.

The real challenge, though, is not storing failures—it’s avoiding bias amplification. If the system overweights past mistakes, it can start suppressing valid experimentation or misclassify edge cases as risk.

Still, the direction is strong: combining structured incident history, team context, and retrieval-based reasoning is one of the most practical paths toward real “organizational intelligence” in AI systems.