Craig ML Dsouza

Posted on Apr 3

Building an AI Research Agent That Uses Real Data (Wiki + Finance)

I Built an AI Agent That Uses Real Data Instead of Just Guessing

Most AI tools today just generate text.

They rely on what the model memorized during training, so they often guess at facts and produce inconsistent outputs.

I wanted to explore a different approach: what if an AI system could fetch real data, process it step by step, and return structured insights instead of raw text?

So I built OpenAgent.

What is OpenAgent?

OpenAgent is a multi-step AI research agent designed to move beyond basic text generation.

Instead of relying only on the model, it:

pulls data from Wikipedia for context
fetches market data from Yahoo Finance
processes everything through a structured pipeline
outputs clean, structured insights

Why this matters

Most AI systems:

generate unstructured text
mix signal with noise
are difficult to integrate into real applications

OpenAgent focuses on:

real data instead of guesses
structured outputs instead of paragraphs
step-by-step processing instead of single-pass generation

How it works

Each query goes through a multi-phase pipeline:

Planning → Execution → Signal Extraction → Synthesis

Planning
Determines which tools to use for the query (Wikipedia, Yahoo Finance)
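
To make the planning step concrete, here is a minimal Python sketch that picks tools with simple rules. It is not OpenAgent's actual planner, which may well ask the model itself to decide. The tool names, the `plan` function, and the rule "a ticker in parentheses means fetch market data" are all assumptions made for this example.

```python
import re

# Hypothetical tool names; the real project may label them differently.
WIKIPEDIA = "wikipedia"
FINANCE = "finance"

# A ticker in parentheses, e.g. "(MSFT)": 1 to 5 uppercase letters.
TICKER_PATTERN = re.compile(r"\(([A-Z]{1,5})\)")


def plan(query: str) -> dict:
    """Return the tools to run and any ticker symbol found in the query."""
    match = TICKER_PATTERN.search(query)
    ticker = match.group(1) if match else None

    # Assumption: background context is useful for every query.
    tools = [WIKIPEDIA]
    if ticker is not None:
        # Market data only makes sense when a symbol is present.
        tools.append(FINANCE)

    return {"tools": tools, "ticker": ticker}


print(plan("Analyze Microsoft (MSFT)"))
```

A query without a ticker would get only the Wikipedia tool under these rules, which keeps the later steps from calling a finance source with nothing to look up.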

Execution
Fetches real data from external sources
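
One way to picture the execution step is a small dispatcher that runs each selected tool and keeps failures separate from results. This is a sketch under my own assumptions: the `execute` function, the registry of fetchers, and the stand-in fetchers below are hypothetical and make no network calls, whereas real fetchers would talk to Wikipedia and Yahoo Finance.

```python
from typing import Callable

# A fetcher takes the query subject and returns raw text from its source.
Fetcher = Callable[[str], str]


def execute(tools: list[str], subject: str, registry: dict[str, Fetcher]) -> dict:
    """Run each selected tool and collect raw results, keeping failures separate."""
    results: dict[str, str] = {}
    errors: dict[str, str] = {}
    for name in tools:
        fetcher = registry.get(name)
        if fetcher is None:
            errors[name] = "no fetcher registered"
            continue
        try:
            results[name] = fetcher(subject)
        except Exception as exc:  # one failing source should not stop the rest
            errors[name] = str(exc)
    return {"results": results, "errors": errors}


# Stand-in fetchers so the sketch runs offline (placeholders, not real data).
def fake_wikipedia(subject: str) -> str:
    return f"<wikipedia text about {subject}>"


def failing_finance(subject: str) -> str:
    raise TimeoutError("finance source timed out")


registry = {"wikipedia": fake_wikipedia, "finance": failing_finance}
print(execute(["wikipedia", "finance"], "Microsoft", registry))
```

Recording errors instead of raising lets the later phases work with whatever sources did respond.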

Signal Extraction
Keeps the high-value information and discards the rest of the raw data
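
As a rough illustration of signal extraction, the Python sketch below scores sentences and keeps only the best ones. The keyword list, the scoring rule, and the function names are assumptions for this example. A real agent might use a model or a tuned ranking instead of keyword counts.

```python
import re

# Hypothetical keywords; a real agent would tune these or use a model.
SIGNAL_TERMS = ("revenue", "growth", "profit", "risk", "acquisition")


def score(sentence: str) -> int:
    """Count keyword hits, plus one point if the sentence contains a digit."""
    lowered = sentence.lower()
    hits = sum(1 for term in SIGNAL_TERMS if term in lowered)
    has_number = 1 if re.search(r"\d", sentence) else 0
    return hits + has_number


def extract_signals(raw_text: str, limit: int = 3) -> list[str]:
    """Return up to `limit` sentences with a non-zero score, best first."""
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", raw_text) if s.strip()]
    kept = [(score(s), s) for s in sentences if score(s) > 0]
    # list.sort is stable, so sentences with equal scores keep their order.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in kept[:limit]]


# Made-up sample text about a fictional company, used only to exercise the code.
sample = (
    "Acme was founded in a garage. Revenue growth reached 12 percent. "
    "The logo is blue. Analysts flag supply risk."
)
print(extract_signals(sample))
```

The point of the step is that only the sentences that survive the filter are passed on, so the synthesis phase sees less noise.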

Synthesis
Generates structured output with key insights
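
A synthesis step along these lines could build a prompt from the extracted signals and parse the model's reply as JSON. The sketch below is an assumption about how that might look, not OpenAgent's code: `build_prompt`, `synthesize`, and the `call_model` callable are hypothetical, and the model here is a stub so the example runs without an API key.

```python
import json
from typing import Callable

# Keys taken from the example output shown in this post.
REPORT_KEYS = ("summary", "keyInsights", "risks", "opportunities",
               "sentiment", "confidenceScore")


def build_prompt(subject: str, signals: list[str]) -> str:
    """Ask for JSON only, listing the extracted signals as evidence."""
    evidence = "\n".join(f"- {signal}" for signal in signals)
    return (
        f"Analyze {subject} using only the evidence below.\n"
        f"{evidence}\n"
        f"Reply with one JSON object with the keys: {', '.join(REPORT_KEYS)}."
    )


def synthesize(subject: str, signals: list[str],
               call_model: Callable[[str], str]) -> dict:
    """Send the prompt to the model and parse its reply as a JSON object."""
    reply = call_model(build_prompt(subject, signals))
    report = json.loads(reply)  # raises ValueError if the reply is not JSON
    if not isinstance(report, dict):
        raise ValueError("expected a JSON object from the model")
    return report


# Stub model that returns a fixed placeholder reply instead of calling an LLM.
def stub_model(prompt: str) -> str:
    return json.dumps({"summary": "placeholder", "sentiment": "POSITIVE"})


print(synthesize("Microsoft (MSFT)", ["signal one"], stub_model))
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM provider.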

Example

Prompt:

Analyze Microsoft (MSFT)

Output:

{
  "summary": "...",
  "keyInsights": ["..."],
  "risks": ["..."],
  "opportunities": ["..."],
  "sentiment": "POSITIVE",
  "confidenceScore": 82
}

Instead of a long paragraph, you get usable, structured data.
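
To show what "usable" can mean in practice, here is a small Python sketch that loads the JSON above into a typed object and rejects unexpected values. The `Report` class and `parse_report` function are hypothetical. Two checks are assumptions on my part: only "POSITIVE" appears in the example, so the other sentiment labels are guesses, and the 0 to 100 range for `confidenceScore` is inferred from the value 82.

```python
import json
from dataclasses import dataclass

# Assumption: only "POSITIVE" is shown in the post; the other labels are guesses.
ALLOWED_SENTIMENTS = {"POSITIVE", "NEUTRAL", "NEGATIVE"}


@dataclass(frozen=True)
class Report:
    summary: str
    key_insights: list[str]
    risks: list[str]
    opportunities: list[str]
    sentiment: str
    confidence_score: int


def parse_report(raw: str) -> Report:
    """Parse the agent's JSON output and reject values outside expected ranges."""
    data = json.loads(raw)
    report = Report(
        summary=data["summary"],
        key_insights=list(data["keyInsights"]),
        risks=list(data["risks"]),
        opportunities=list(data["opportunities"]),
        sentiment=data["sentiment"],
        confidence_score=data["confidenceScore"],
    )
    if report.sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {report.sentiment!r}")
    # Assumption: the score is on a 0 to 100 scale.
    if not 0 <= report.confidence_score <= 100:
        raise ValueError(f"confidence out of range: {report.confidence_score}")
    return report


raw = (
    '{"summary": "...", "keyInsights": ["..."], "risks": ["..."], '
    '"opportunities": ["..."], "sentiment": "POSITIVE", "confidenceScore": 82}'
)
report = parse_report(raw)
print(report.sentiment, report.confidence_score)
```

Because the fields are fixed, application code can branch on `report.sentiment` or compare `report.confidence_score` against a threshold without parsing prose.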

Demo

The system fetches real data, processes it, and generates structured results in real time.

What you can do with it

Analyze stocks and companies
Perform quick research with real data
Build structured AI workflows
Use outputs directly in applications

Key takeaway

LLMs are powerful, but raw text output is often not enough.

By combining:

external data sources
structured pipelines
controlled outputs

you can build systems that are more reliable and usable in practice.

Try it out

GitHub:
https://github.com/CraigMLdsouza/OpenAgent

Full version:
https://craigstorm.gumroad.com/l/openagent-research

Final note

This is a developer-focused project aimed at exploring more reliable AI systems.

If you're building with AI, moving beyond text generation into data-driven agents is a direction worth exploring.
