<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Waqar Akhtar</title>
    <description>The latest articles on DEV Community by Waqar Akhtar (@waqar_akhtar_f4a1df2340f1).</description>
    <link>https://dev.to/waqar_akhtar_f4a1df2340f1</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3437221%2F16b10a65-aa94-4f7f-838b-3281afdf4dc1.jpg</url>
      <title>DEV Community: Waqar Akhtar</title>
      <link>https://dev.to/waqar_akhtar_f4a1df2340f1</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/waqar_akhtar_f4a1df2340f1"/>
    <language>en</language>
    <item>
      <title>Escaping the Stateless Trap: Building a Context-Aware Support Agent</title>
      <dc:creator>Waqar Akhtar</dc:creator>
      <pubDate>Sun, 12 Apr 2026 17:47:11 +0000</pubDate>
      <link>https://dev.to/waqar_akhtar_f4a1df2340f1/escaping-the-stateless-trap-building-a-context-aware-support-agent-2aad</link>
      <guid>https://dev.to/waqar_akhtar_f4a1df2340f1/escaping-the-stateless-trap-building-a-context-aware-support-agent-2aad</guid>
      <description>&lt;p&gt;The hardest part about building an automated support system isn't generating human-like text, but getting the system to actually remember the customer. I was tired of prompt engineering and started looking for a better way to help my &lt;a href="https://vectorize.io/what-is-agent-memory" rel="noopener noreferrer"&gt;agent remember&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Some time ago, I set out to build IRIS (Intelligent Recall &amp;amp; Issue Support). I didn't want to build just another standalone chatbot. Instead, I built a multi-tenant API service that any e-commerce business can plug into their existing tools to turn their rigid support systems into context-aware agents. Most of the off-the-shelf support bots I had evaluated suffered from the same fatal flaw: they treated every interaction like a first date. No matter how many times a customer complained about a delayed package, the bot would gleefully ask for their order number again. It was infuriating. I needed a way to give my &lt;a href="https://vectorize.io/what-is-agent-memory" rel="noopener noreferrer"&gt;agent memory&lt;/a&gt; so it could retain context across sessions, rather than just within a single chat window.&lt;/p&gt;

&lt;h2&gt;
  
  
  What IRIS Does and How It Hangs Together
&lt;/h2&gt;

&lt;p&gt;IRIS is fundamentally a fast, stateful API layer built on FastAPI that sits between the customer chat widget, our order management systems (OMS), and an LLM. It routes messages, handles multi-tenant authentication, and most importantly, manages long-term state.&lt;/p&gt;

&lt;p&gt;The business goal was an integration-first approach. E-commerce brands don't want another siloed dashboard. They want a smart layer that quietly sits between their existing helpdesks and their Shopify backends. By building this as a headless API, a platform can offer IRIS to hundreds of different brands simultaneously, keeping each brand's customer data, tone of voice, and order history strictly isolated.&lt;/p&gt;

&lt;p&gt;The system is designed around a three-pillar architecture:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;The LLM Engine:&lt;/strong&gt; We use Groq-hosted &lt;code&gt;llama3-70b-8192&lt;/code&gt; for incredibly fast turnaround times. Speed is a feature when you are doing multiple internal validation passes before responding to a user.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;The Integration Layer:&lt;/strong&gt; A set of connectors (Shopify, REST APIs) that actively fetch live order states so the LLM doesn't hallucinate shipment statuses.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;The Memory Layer:&lt;/strong&gt; Instead of cramming entire chat transcripts into a vector database or hoping a 1M token context window solves all my problems, I decided to try &lt;a href="https://github.com/vectorize-io/hindsight" rel="noopener noreferrer"&gt;Hindsight&lt;/a&gt; for agent memory.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The core flow is simple in concept but tricky in execution. When a message comes in, the backend immediately queries the memory layer to pull the customer's historical profile and recent interactions. Simultaneously, it fetches their active orders from the OMS. It also checks a global "incident" stream to see if this customer's issue (e.g., "missing package") matches a spike in similar complaints across the tenant. All this context is assembled into a dense system prompt, fed to the LLM, and the response is parsed. If the LLM decides an action is needed (like issuing a refund), it outputs a structured JSON block, which the backend intercepts, strips from the user-facing text, and executes. Finally, the interaction is written back to memory.&lt;/p&gt;
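&lt;p&gt;The context-assembly step above can be sketched as a plain function. The section headings and field layout here are illustrative assumptions; the article doesn't show the real prompt format:&lt;/p&gt;

```python
# Hypothetical sketch of assembling the dense system prompt from the
# recalled profile, history, OMS data, and any active incident.
def build_system_prompt(profile: str, history: list, orders: list, incident: str = "") -> str:
    sections = [
        "You are a support agent for this store. Use the context below.",
        "## Customer profile\n" + profile,
        "## Recent interactions\n" + "\n".join(history),
        "## Active orders\n" + "\n".join(
            f"Order {o['id']}: {o['status']}" for o in orders
        ),
    ]
    if incident:
        # A known tenant-wide incident short-circuits per-user troubleshooting.
        sections.append("## Known incident\n" + incident)
    sections.append(
        "If an action is required, append a fenced JSON block "
        'like {"action": "initiate_refund", "order_id": "..."} after your reply.'
    )
    return "\n\n".join(sections)

prompt = build_system_prompt(
    profile="Prefers concise replies; two prior complaints about shipping.",
    history=["Customer: Where is my order?", "Agent: It ships Monday."],
    orders=[{"id": "12345", "status": "replacement shipped via UPS"}],
    incident="Checkout errors affecting all users since 09:00 UTC",
)
```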

&lt;h2&gt;
  
  
  The Core Technical Story: Segregating Memory
&lt;/h2&gt;

&lt;p&gt;The most interesting technical challenge wasn't the LLM integration. Calling an API is trivial. The challenge was structuring the memory so the agent could be both highly personalized to the individual and broadly aware of systemic issues. &lt;/p&gt;

&lt;p&gt;Early on, I realized that dumping all interactions into a single vectorized bucket per tenant was a disaster. The agent would get confused, occasionally cross-referencing complaints from User A when talking to User B. I needed strict boundaries.&lt;/p&gt;

&lt;p&gt;I came across &lt;a href="https://hindsight.vectorize.io/" rel="noopener noreferrer"&gt;Hindsight agent memory&lt;/a&gt; and decided to give it a try because it allowed me to strictly segregate state into distinct "banks." &lt;/p&gt;

&lt;p&gt;We split the memory architecture into two distinct layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Per-Customer Banks:&lt;/strong&gt; A localized storage area specific to a single user ID. This stores their communication style, previous complaints, and preferences. &lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Global Pattern Banks:&lt;/strong&gt; A tenant-wide storage area that tracks issue types. If 50 people suddenly report a "warehouse delay," we don't want the bot asking the 51st person to clear their cache. We want it to acknowledge the known outage immediately.&lt;/li&gt;
&lt;/ol&gt;
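&lt;p&gt;In code, the two layers reduce to a strict bank-naming scheme. The per-customer pattern matches the &lt;code&gt;bank_id&lt;/code&gt; f-strings used later in this post; the &lt;code&gt;_global_issues&lt;/code&gt; suffix is an assumption for illustration:&lt;/p&gt;

```python
# Bank-naming helpers for the two memory layers. Routing every read and
# write through these is what prevents User A's complaints from ever
# surfacing in User B's session.
def customer_bank(tenant_id: str, user_id: str) -> str:
    # One isolated bank per (tenant, customer) pair.
    return f"{tenant_id}_user_{user_id}"

def global_pattern_bank(tenant_id: str) -> str:
    # One tenant-wide bank for systemic issue patterns (suffix assumed).
    return f"{tenant_id}_global_issues"
```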

&lt;p&gt;This segregation completely changed how the agent behaved. It moved from being a reactive text generator to a proactive support system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Code-Backed Explanations
&lt;/h2&gt;

&lt;p&gt;Here is how we handle the memory retention and pattern detection in code. When a user sends a message, we first classify the intent and log it globally if it's an issue.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# backend/agent.py (simplified)
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# 1. Detect if this is a systemic issue
&lt;/span&gt;    &lt;span class="n"&gt;issue_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;detect_issue_type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;issue_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;report_to_global_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;issue_type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Check if we are currently in an active incident for this issue
&lt;/span&gt;        &lt;span class="n"&gt;active_incident&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;check_active_incidents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;issue_type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;active_incident&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Short-circuit standard troubleshooting
&lt;/span&gt;            &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;incident_alert&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Known issue: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;active_incident&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. Recall personal history
&lt;/span&gt;    &lt;span class="n"&gt;customer_history&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;memory_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;bank_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_user_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 3. Generate a quick reflection on the customer's state
&lt;/span&gt;    &lt;span class="n"&gt;profile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;memory_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reflect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;bank_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_user_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generate_llm_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;customer_history&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;memory_client.reflect&lt;/code&gt; call is particularly powerful. Instead of passing raw past transcripts to the LLM, which eats up tokens and dilutes the prompt, we use the memory layer to generate a dense, reasoned summary of the customer.&lt;/p&gt;

&lt;p&gt;When the interaction is over, we write the exchange back. The &lt;code&gt;hindsight-client&lt;/code&gt; makes this straightforward.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# backend/memory.py (simplified)
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retain_interaction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent_response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;bank_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_user_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# Store the interaction in the customer's specific memory bank
&lt;/span&gt;    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;memory_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;retain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;bank_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;bank_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Customer: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_msg&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Agent: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent_response&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;get_current_time&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, we needed a way for the LLM to actually &lt;em&gt;do&lt;/em&gt; things, not just apologize. We force the LLM to append a specific JSON structure if it wants to invoke a tool, which we parse out before showing the message to the user.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# backend/actions.py (simplified)
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_and_execute_action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Look for a JSON block at the end of the response
&lt;/span&gt;    &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;```

json\n(.*?)\n

```&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DOTALL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;action_req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;action_req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;initiate_refund&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;execute_refund&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action_req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;order_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="c1"&gt;# Strip the JSON so the user doesn't see it
&lt;/span&gt;            &lt;span class="n"&gt;clean_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;clean_response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;pass&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Results and Behavior
&lt;/h2&gt;

&lt;p&gt;The difference in user experience is stark. In our initial tests without localized memory, a customer asking "Where is my replacement?" would be met with "I'm sorry, I don't see a replacement. Can you provide your order number?"&lt;/p&gt;

&lt;p&gt;With the dual-bank memory system in place, the interaction looks like this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User:&lt;/strong&gt; Where is my replacement?&lt;br&gt;
&lt;strong&gt;IRIS:&lt;/strong&gt; I see we initiated a replacement for order #12345 yesterday because the original arrived damaged. It looks like it shipped this morning via UPS (Tracking: 1Z9999). It should arrive by Thursday.&lt;/p&gt;

&lt;p&gt;The global incident detection also proved its worth immediately. During a simulated partial outage with our mock OMS, the system noticed a spike in "can't checkout" messages. By the 4th user, the agent stopped trying to debug their individual browser cache and started responding with: "We are currently experiencing widespread checkout issues. Our engineering team is looking into it. I'll flag your account so we can notify you when it's resolved." &lt;/p&gt;
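&lt;p&gt;The spike detection behind that behavior can be sketched as a simple count-in-window threshold. The window length and threshold here are assumptions (chosen so the fourth report trips the incident, matching the behavior above); the real heuristic isn't specified:&lt;/p&gt;

```python
# Hypothetical sketch of tenant-wide incident-spike detection using a
# sliding time window over issue reports.
from collections import defaultdict, deque

WINDOW_SECONDS = 900   # look back 15 minutes (assumed)
SPIKE_THRESHOLD = 3    # the 4th report in the window declares an incident

_reports = defaultdict(deque)

def report_issue(tenant_id: str, issue_type: str, now: float) -> bool:
    """Record one report; return True once the issue qualifies as an incident."""
    q = _reports[(tenant_id, issue_type)]
    q.append(now)
    # Evict reports that have aged out of the window.
    while q and now - q[0] > WINDOW_SECONDS:
        q.popleft()
    return len(q) > SPIKE_THRESHOLD

# Four "can't checkout" reports in quick succession: the 4th flips to True.
hits = [report_issue("acme", "cant_checkout", t) for t in range(4)]
```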

&lt;p&gt;For a business, this isn't just a neat trick. It means deflecting hundreds of identical support tickets during a crisis without a human agent ever needing to get involved. It saved an enormous amount of redundant API calls and user frustration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;p&gt;Building IRIS taught me a few hard truths about moving from toy AI scripts to reliable background systems:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;State is harder than intelligence.&lt;/strong&gt; LLMs are incredibly smart text generators, but without a robust, isolated memory layer, they are essentially amnesiacs. You have to treat memory management as a first-class architectural component, not an afterthought bolted onto a prompt. A friend said Hindsight was the &lt;a href="https://github.com/vectorize-io/hindsight" rel="noopener noreferrer"&gt;best agent memory&lt;/a&gt; they had tried, so I decided to use it in my project, and it worked well because it separated the state management from the inference logic.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Summarize, don't concatenate.&lt;/strong&gt; Dumping raw chat logs into a context window degrades performance rapidly. The "lost in the middle" phenomenon is real. Using intermediate reflection steps to summarize a user's profile before the main LLM call drastically improved accuracy and reduced token costs.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Strict separation of concerns prevents hallucinations.&lt;/strong&gt; Don't rely on the LLM to both generate empathetic text and strictly format an API call in the same breath if you can avoid it. By forcing a clean JSON block at the very end of the response for actions, we could easily parse, validate, and strip it out using standard regex and Python logic, rather than begging the LLM to format things perfectly in-line.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Speed covers a multitude of sins.&lt;/strong&gt; By switching to incredibly fast inference hardware (Groq in our case), we bought ourselves the time budget to do all these background tasks (recall, reflection, OMS lookups) sequentially before the user ever noticed a delay. If your base inference takes 5 seconds, you can't build complex agentic workflows without frustrating the user.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Building a stateful agent isn't about finding the perfect system prompt; it's about building the plumbing that ensures the prompt is populated with exactly the right context at exactly the right time.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>fastapi</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Hacktoberfest 2025 and the JEE-fication of Indian Tech</title>
      <dc:creator>Waqar Akhtar</dc:creator>
      <pubDate>Thu, 09 Oct 2025 16:48:17 +0000</pubDate>
      <link>https://dev.to/waqar_akhtar_f4a1df2340f1/hacktoberfest-2025-and-the-jee-fication-of-indian-tech-4k9f</link>
      <guid>https://dev.to/waqar_akhtar_f4a1df2340f1/hacktoberfest-2025-and-the-jee-fication-of-indian-tech-4k9f</guid>
      <description>&lt;p&gt;Hacktoberfest is supposed to be about celebrating open source.&lt;br&gt;
Instead, it has turned into a flood of spammy PRs, empty files, and “pls merge” requests.&lt;/p&gt;

&lt;p&gt;It wasn’t just embarrassing - it was eye-opening.&lt;/p&gt;

&lt;p&gt;The JEE-fication of Tech Culture&lt;/p&gt;

&lt;p&gt;Somewhere along the way, Indian tech education adopted the same mindset that dominates competitive exams:&lt;br&gt;
marks &amp;gt; mastery, results &amp;gt; reasoning, badges &amp;gt; building.&lt;/p&gt;

&lt;p&gt;We’ve built a system that values certificates over skills, GitHub squares over systems thinking, and LinkedIn clout over long-term learning.&lt;br&gt;
It’s no longer about understanding, it’s about appearing active.&lt;/p&gt;

&lt;p&gt;When colleges, clubs, and YouTube channels push “make 6 PRs = free T-shirt,” the result is predictable - a wave of students contributing nonsense to open source projects just to farm metrics.&lt;/p&gt;

&lt;p&gt;The Harsh Truth&lt;/p&gt;

&lt;p&gt;And let’s be honest: most of these “devs” are 18+.&lt;br&gt;
At this age, you have access to everything: free documentation, open courses, global mentors, and AI tutors.&lt;br&gt;
If you still choose to spam instead of learn, that’s not a system failure - that’s a you failure.&lt;br&gt;
No one can save you if you refuse to think for yourself.&lt;br&gt;
The system may have trained you to chase numbers, but you’re the one deciding to stay shallow.&lt;/p&gt;

&lt;p&gt;The Coming Obsolescence&lt;/p&gt;

&lt;p&gt;AI is evolving faster than ever.&lt;br&gt;
Surface-level skills - copy-paste coding, syntax recall, or tutorial-level web apps - are already being automated.&lt;br&gt;
If you don’t understand how systems work, if you can’t solve real problems, you’ll be replaced - not by another human, but by a machine.&lt;br&gt;
We’re not producing engineers anymore; we’re mass-producing resume coders.&lt;/p&gt;

&lt;p&gt;The Relief&lt;/p&gt;

&lt;p&gt;But strangely, this realization gives me peace.&lt;br&gt;
Because now I see that 95–99% of the competition is bad competition.&lt;br&gt;
People optimizing for the wrong things - chasing visibility over value, validation over growth.&lt;/p&gt;

&lt;p&gt;And that means one thing:&lt;br&gt;
When I build real skills, when I focus on depth over decoration, and when I create actual impact,&lt;br&gt;
I’ll stand out effortlessly.&lt;/p&gt;

&lt;p&gt;Because genuine skill will always find its place — whether it’s 2025 or 2030.&lt;/p&gt;

&lt;p&gt;The Way Forward&lt;/p&gt;

&lt;p&gt;Open source, AI, and the cloud aren’t badges - they’re crafts.&lt;br&gt;
The future belongs to those who build, question, and learn deeply.&lt;br&gt;
The ones who read the docs, break systems, and fix them again.&lt;/p&gt;

&lt;p&gt;Real engineers won’t be replaced by AI.&lt;br&gt;
Only the JEE-fied ones will.&lt;/p&gt;

</description>
      <category>hacktoberfest2025</category>
      <category>techculture</category>
      <category>opensource</category>
      <category>ai</category>
    </item>
    <item>
      <title>EchoME X – Redefining How Creators Echo Their Voice</title>
      <dc:creator>Waqar Akhtar</dc:creator>
      <pubDate>Fri, 12 Sep 2025 13:04:01 +0000</pubDate>
      <link>https://dev.to/waqar_akhtar_f4a1df2340f1/echome-x-redefining-how-creators-echo-their-voice-2ma5</link>
      <guid>https://dev.to/waqar_akhtar_f4a1df2340f1/echome-x-redefining-how-creators-echo-their-voice-2ma5</guid>
      <description>&lt;p&gt;The Problem No One Talks About&lt;/p&gt;

&lt;p&gt;Every creator, influencer, and entrepreneur faces the same silent struggle: your voice doesn’t carry as far as your ideas deserve.&lt;/p&gt;

&lt;p&gt;You record, edit, re-upload, repeat.&lt;/p&gt;

&lt;p&gt;You drown in algorithms, formats, and platforms.&lt;/p&gt;

&lt;p&gt;You waste more energy on distribution than on creation.&lt;/p&gt;

&lt;p&gt;And worst of all?&lt;br&gt;
Your audience never fully feels the depth of what you’re trying to say. Your echo dies too soon.&lt;/p&gt;

&lt;p&gt;The Spark of EchoME X&lt;/p&gt;

&lt;p&gt;I never imagined myself building this. But one day, I asked myself: if someone wants to start something remarkable, who is the best person they could talk to?&lt;/p&gt;

&lt;p&gt;The answer was clear - Steve Jobs, Sam Altman, Elon Musk.&lt;br&gt;
But none of us can just call them. None of us can sit across the table and ask, “What would you do if you were in my shoes?”&lt;/p&gt;

&lt;p&gt;That’s when it hit me. What if we could create a digital twin - an echo of the greatest minds in tech, entrepreneurship, sports, or politics? Not a copy, but a living personality that grows, learns, and reflects you.&lt;/p&gt;

&lt;p&gt;EchoME X was born out of that idea: to make impossible conversations possible.&lt;/p&gt;

&lt;p&gt;What EchoME X Does&lt;/p&gt;

&lt;p&gt;EchoME X is not another productivity tool. It is the loudspeaker for your ideas and the mirror for your ambitions.&lt;/p&gt;

&lt;p&gt;It learns from you.&lt;/p&gt;

&lt;p&gt;It helps shape your voice, your style, your influence.&lt;/p&gt;

&lt;p&gt;It becomes a twin that speaks your language - or the language of those you wish to learn from.&lt;/p&gt;

&lt;p&gt;Imagine:&lt;/p&gt;

&lt;p&gt;A founder building their startup while sparring ideas with the ghost of Steve Jobs.&lt;/p&gt;

&lt;p&gt;An athlete pushing limits while getting words of fire from Muhammad Ali.&lt;/p&gt;

&lt;p&gt;A student dreaming big while talking strategy with Sam Altman.&lt;/p&gt;

&lt;p&gt;That’s what EchoME X unlocks.&lt;/p&gt;

&lt;p&gt;How I Built It&lt;/p&gt;

&lt;p&gt;Frontend: A sleek interface for creation and interaction.&lt;/p&gt;

&lt;p&gt;Backend: APIs and data pipelines stitched together from scratch — I had zero backend experience before this. Every line of code was written, debugged, and fought for.&lt;/p&gt;

&lt;p&gt;Personality Engine: Questions, traits, and psychology models that let you shape your AI twin’s persona.&lt;/p&gt;

&lt;p&gt;No shortcuts. No templates. Built entirely solo, brick by brick, so that one day it can be used by millions.&lt;/p&gt;

&lt;p&gt;Challenges Along the Way&lt;/p&gt;

&lt;p&gt;Writing backend code from scratch with no prior experience.&lt;/p&gt;

&lt;p&gt;Integrating a unique, complex personality system with almost no reference points.&lt;/p&gt;

&lt;p&gt;Building everything alone while racing against time.&lt;/p&gt;

&lt;p&gt;It was not just code. It was trial by fire.&lt;/p&gt;

&lt;p&gt;Accomplishments I’m Proud Of&lt;/p&gt;

&lt;p&gt;I built an MVP that works — fully solo.&lt;br&gt;
An AI twin you can interact with today.&lt;br&gt;
Something that doesn’t just sit on paper but can genuinely start as a venture.&lt;/p&gt;

&lt;p&gt;Most importantly, I proved to myself: even with nothing, you can build something the world can use.&lt;/p&gt;

&lt;p&gt;What’s Next for EchoME X&lt;/p&gt;

&lt;p&gt;Testing on a larger scale.&lt;/p&gt;

&lt;p&gt;Gathering feedback from a wide pool of users.&lt;/p&gt;

&lt;p&gt;Adding multi-language and voice conversation capabilities.&lt;/p&gt;

&lt;p&gt;This is just the first echo. The loudest ones are yet to come.&lt;/p&gt;

</description>
      <category>kiro</category>
    </item>
    <item>
      <title>AI Bubble: Reality Check</title>
      <dc:creator>Waqar Akhtar</dc:creator>
      <pubDate>Mon, 25 Aug 2025 18:39:24 +0000</pubDate>
      <link>https://dev.to/waqar_akhtar_f4a1df2340f1/ai-bubble-reality-check-1hlo</link>
      <guid>https://dev.to/waqar_akhtar_f4a1df2340f1/ai-bubble-reality-check-1hlo</guid>
      <description>&lt;p&gt;For the past two years, it honestly felt like AI was gripping the steering wheel while humans were locked in the trunk—just watching the hype drive everything forward. But now? It actually feels like both AI and humans are gonna have their hands on the wheel, figuring things out together. (Yeah, I’m taking this straight from Thor: Ragnarok—it’s perfect.)&lt;/p&gt;

&lt;p&gt;The hype? Slowing down. Meta just froze hiring in its AI division after blowing billions. MIT says 95% of generative AI projects don’t actually deliver value. Reality check. Sure, companies are hiring again, but the market won’t ever feel like the COVID-era tech boom—those crazy, “everyone gets hired” days are gone.&lt;/p&gt;

&lt;p&gt;Customer Service Is the Reality Check&lt;/p&gt;

&lt;p&gt;Take customer service. Tons of companies fired entire teams and replaced them with AI to save money. On paper, cool. In practice? A disaster. Customers of Amazon, Swiggy, and others are furious. Endless loops, robotic answers, zero empathy. Sure, profit margins went up, but customer trust and satisfaction tanked. &lt;br&gt;
Classic bubble behavior: chasing hype instead of actually solving problems.&lt;/p&gt;

&lt;p&gt;Apple—Waiting in the Shadows&lt;/p&gt;

&lt;p&gt;And then there’s Apple. They literally did nothing in AI while everyone else was racing to launch flashy tools. But now? Perfect timing. Apple can start building AI that actually matters—stuff people use in real life, not just hype demos. This is where AI can shine: solving real problems instead of just padding valuations.&lt;/p&gt;

&lt;p&gt;My Take&lt;/p&gt;

&lt;p&gt;The AI bubble isn’t about the tech failing—it’s about misuse and overhype. Replacing humans completely rarely works. Augmenting humans? Almost always works. Devs with AI copilots, doctors with AI diagnostics, logistics teams with AI optimization—these are examples that actually help.&lt;/p&gt;

&lt;p&gt;So yeah, some of the bubble will pop. But what’s left? Far more valuable than the hype. The future isn’t AI first. It’s AI + humans, both on the wheel, smarter together.&lt;/p&gt;

</description>
      <category>ai</category>
    </item>
    <item>
      <title>Moving Beyond Web Dev</title>
      <dc:creator>Waqar Akhtar</dc:creator>
      <pubDate>Tue, 19 Aug 2025 14:55:27 +0000</pubDate>
      <link>https://dev.to/waqar_akhtar_f4a1df2340f1/moving-beyond-web-dev-4a3m</link>
      <guid>https://dev.to/waqar_akhtar_f4a1df2340f1/moving-beyond-web-dev-4a3m</guid>
      <description>&lt;p&gt;I started like many others with Web Development.&lt;br&gt;
HTML, CSS, JavaScript.&lt;br&gt;
Frontend was my entry point into tech.&lt;/p&gt;

&lt;p&gt;I built simple sites first. Then full projects.&lt;br&gt;
FocusFlow, PrepPal, a few hackathon apps.&lt;br&gt;
At this stage, I can pretty much build any working site if I want to.&lt;br&gt;
And I even have an idea of how to go full stack if the need arises.&lt;/p&gt;

&lt;p&gt;But here’s the thing…&lt;br&gt;
The market for web developers is overly saturated.&lt;br&gt;
Almost every other person knows basic frontend and can make a portfolio or clone a website.&lt;br&gt;
It’s getting harder and harder to stand out from the crowd.&lt;/p&gt;

&lt;p&gt;Add to that the rise of no-code and low-code tools.&lt;br&gt;
Soon, most of web/app development won’t even need developers — just drag, drop, and publish.&lt;br&gt;
At best, devs will be there for debugging or rare edge cases.&lt;/p&gt;

&lt;p&gt;That realization changed my direction.&lt;br&gt;
I don’t want to spend years perfecting something that might be automated tomorrow.&lt;br&gt;
So I’m leaving frontend development at this stage.&lt;br&gt;
Not because I can’t go further — but because I want to go bigger.&lt;/p&gt;

&lt;p&gt;AI. Cloud. Data Science.&lt;br&gt;
The kind of technologies that I think will still shape the future when I graduate and start working.&lt;/p&gt;

&lt;p&gt;Web dev taught me how to think in code.&lt;br&gt;
How to take an idea and turn it into a working project.&lt;br&gt;
For that, I’ll always value it.&lt;br&gt;
But my journey is moving forward.&lt;/p&gt;

&lt;p&gt;My move toward AI, Cloud, and Data Science isn't a retreat—it’s the act of an explorer. I've recognized that the most impactful work often happens at the new frontiers of technology, where problems are still being defined and solved. This is where curiosity and a willingness to learn new domains are most rewarded.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
    </item>
    <item>
      <title>College vs Skills — Student POV</title>
      <dc:creator>Waqar Akhtar</dc:creator>
      <pubDate>Tue, 19 Aug 2025 14:06:54 +0000</pubDate>
      <link>https://dev.to/waqar_akhtar_f4a1df2340f1/college-vs-skills-student-pov-13pi</link>
      <guid>https://dev.to/waqar_akhtar_f4a1df2340f1/college-vs-skills-student-pov-13pi</guid>
      <description>&lt;p&gt;College wants marks.&lt;br&gt;
Industry wants skills.&lt;br&gt;
And I am stuck in the middle.&lt;/p&gt;

&lt;p&gt;In class there’s a fixed syllabus, regular exams, theory overload, assignments, and projects.&lt;br&gt;
Sometimes even outdated tools (Java in Notepad… yeah, that still exists).&lt;/p&gt;

&lt;p&gt;Outside class, I see this fast-moving tech world and try my best to catch up. Yet I still feel like I’m lagging behind. Juggling projects, hackathons, GitHub commits at 2AM, learning cloud and AI from YouTube.&lt;/p&gt;

&lt;p&gt;The fact is, balancing both is brutal.&lt;br&gt;
Assignments don’t care if you’re building the next big thing.&lt;br&gt;
Projects don’t wait because you’ve got an internal test tomorrow.&lt;/p&gt;

&lt;p&gt;And somewhere in between… burnout sneaks in.&lt;br&gt;
You start questioning: Should I just focus on exams? Or grind skills for the future?&lt;/p&gt;

&lt;p&gt;Still figuring it out. Still stuck in the middle.&lt;/p&gt;

</description>
      <category>collegevsskills</category>
    </item>
  </channel>
</rss>
