<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Simon Mak</title>
    <description>The latest articles on DEV Community by Simon Mak (@simonplmakcloud).</description>
    <link>https://dev.to/simonplmakcloud</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3772685%2Ff76b8949-5316-4b38-b4d1-63992d239ee1.jpeg</url>
      <title>DEV Community: Simon Mak</title>
      <link>https://dev.to/simonplmakcloud</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/simonplmakcloud"/>
    <language>en</language>
    <item>
      <title>Why AI Agents Need Accessibility Skills: Building WCAG AAA Compliance Into AI Code Generation</title>
      <dc:creator>Simon Mak</dc:creator>
      <pubDate>Sun, 15 Feb 2026 13:35:53 +0000</pubDate>
      <link>https://dev.to/simonplmakcloud/why-ai-agents-need-accessibility-skills-building-wcag-aaa-compliance-into-ai-code-generation-4mdl</link>
      <guid>https://dev.to/simonplmakcloud/why-ai-agents-need-accessibility-skills-building-wcag-aaa-compliance-into-ai-code-generation-4mdl</guid>
<description>&lt;p&gt;I have open-sourced a toolkit that is both a traditional design system and an &lt;strong&gt;AI agent skill&lt;/strong&gt; for building WCAG 2.2 AAA-compliant web applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/simonplmak-cloud/wcag-aaa-web-design" rel="noopener noreferrer"&gt;simonplmak-cloud/wcag-aaa-web-design&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;My goal was to solve two problems at once:&lt;/p&gt;

&lt;h2&gt;The ESG Problem&lt;/h2&gt;

&lt;p&gt;Companies need a reliable way to meet digital accessibility requirements (such as the European Accessibility Act, effective June 2025) for their ESG social responsibility goals. Digital inclusion is a key part of the "S" in ESG. This toolkit provides a production-ready, token-based design system for building enterprise web applications that meet the highest accessibility standard.&lt;/p&gt;

&lt;h2&gt;The AI Problem&lt;/h2&gt;

&lt;p&gt;AI coding agents are powerful, but they often generate inaccessible code. This creates a future where the automated web is unusable for people with disabilities. This project is structured as an &lt;strong&gt;AI agent skill&lt;/strong&gt;, meaning an AI assistant can use it to autonomously build a fully compliant website by enforcing accessibility at the component and template level.&lt;/p&gt;

&lt;h2&gt;Why These Two Problems Are Connected&lt;/h2&gt;

&lt;p&gt;AI agents navigate the web using the same &lt;strong&gt;Accessibility Tree&lt;/strong&gt; as screen readers. Research shows agents are significantly more effective on accessible sites (~85% task success vs. ~50% on inaccessible ones). By making accessibility a core part of the AI development process, we ensure the agent-driven web is inclusive by default.&lt;/p&gt;

&lt;h2&gt;What the Toolkit Includes&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A full, token-based corporate design system (CSS custom properties, no hardcoded values)&lt;/li&gt;
&lt;li&gt;Secure, responsive, accessible HTML/CSS/JS templates (header, footer, data tables, sidebar navigation, empty states)&lt;/li&gt;
&lt;li&gt;In-depth reference guides on WCAG 2.2 AAA compliance, ARIA patterns, enterprise UX, security, and error handling&lt;/li&gt;
&lt;li&gt;Automated validation scripts (contrast checking, pa11y auditing)&lt;/li&gt;
&lt;li&gt;Framework-agnostic: works with any tech stack&lt;/li&gt;
&lt;/ul&gt;
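
&lt;p&gt;To make the contrast-checking idea concrete, here is a minimal standalone sketch of the kind of check such a validation script can run. This is illustrative rather than the toolkit's actual implementation (the function names are mine); the formulas are the standard WCAG relative-luminance and contrast-ratio definitions, with 7:1 as the AAA threshold for normal text.&lt;/p&gt;

```python
def _linear(channel):
    """Convert one sRGB channel (0-255) to a linear-light value, per WCAG 2.x."""
    c = channel / 255.0
    if c > 0.03928:
        return ((c + 0.055) / 1.055) ** 2.4
    return c / 12.92

def relative_luminance(rgb):
    """WCAG relative luminance of an (r, g, b) tuple."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """Contrast ratio between two colors, from 1:1 up to 21:1."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

def meets_aaa(fg, bg, large_text=False):
    """AAA requires 7:1 for normal text and 4.5:1 for large text."""
    required = 4.5 if large_text else 7.0
    return contrast_ratio(fg, bg) >= required
```

&lt;p&gt;Black on white, for instance, scores the maximum 21:1, while a mid-grey like &lt;code&gt;#777777&lt;/code&gt; on white falls below the AAA bar.&lt;/p&gt;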

&lt;h2&gt;The Bigger Picture&lt;/h2&gt;

&lt;p&gt;This is an attempt at responsible innovation. As AI agents increasingly write our code and navigate our websites, accessibility is no longer just about compliance. It is the shared interface between humans and machines. Building for accessibility is building for the AI agent economy.&lt;/p&gt;

&lt;p&gt;I would welcome any feedback on this approach, or contributions to the project.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Repository&lt;/em&gt;: &lt;a href="https://github.com/simonplmak-cloud/wcag-aaa-web-design" rel="noopener noreferrer"&gt;github.com/simonplmak-cloud/wcag-aaa-web-design&lt;/a&gt;&lt;br&gt;
&lt;em&gt;License&lt;/em&gt;: MIT&lt;/p&gt;

</description>
      <category>a11y</category>
      <category>ai</category>
      <category>webdev</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Building a Financial Data Pipeline: How I Scraped 25 Years of Stock Market Filings with Python and a Graph Database</title>
      <dc:creator>Simon Mak</dc:creator>
      <pubDate>Sat, 14 Feb 2026 13:30:22 +0000</pubDate>
      <link>https://dev.to/simonplmakcloud/building-a-financial-data-pipeline-how-i-scraped-25-years-of-stock-market-filings-with-python-and-3e4f</link>
      <guid>https://dev.to/simonplmakcloud/building-a-financial-data-pipeline-how-i-scraped-25-years-of-stock-market-filings-with-python-and-3e4f</guid>
<description>&lt;h2&gt;The Discovery: An Undocumented JSON API&lt;/h2&gt;

&lt;p&gt;The official HKEx website is a maze of JavaScript and session-based navigation. Scraping it directly with tools like Selenium would be slow, brittle, and a constant maintenance headache. I knew there had to be a better way.&lt;/p&gt;

&lt;p&gt;After some digging in my browser's network tab while using the official search portal, I found a hidden gem: an undocumented JSON API. The website's frontend was making calls to a &lt;code&gt;titleSearchServlet.do&lt;/code&gt; endpoint that returned clean, structured JSON data. This was the key. By mimicking these API calls, I could bypass the browser entirely and get the data directly from the source.&lt;/p&gt;

&lt;h2&gt;The Stack: Python, Requests, and SurrealDB&lt;/h2&gt;

&lt;p&gt;With the API discovered, I chose a simple but powerful stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Python&lt;/strong&gt;: For its rich data processing ecosystem and ease of use.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Requests&lt;/strong&gt;: A straightforward library for making the necessary HTTP calls to the HKEx API.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;SurrealDB&lt;/strong&gt;: A multi-model database that was a perfect fit for this project. I could store the filing metadata as structured documents and, more importantly, create graph relationships between companies and filings.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Architecture: A Two-Phase Pipeline&lt;/h2&gt;

&lt;p&gt;The process is broken down into two main phases: scraping the metadata and then enriching it with the full document content.&lt;/p&gt;

&lt;h3&gt;Phase 1: Scraping Filing Metadata&lt;/h3&gt;

&lt;p&gt;The first step is to fetch the metadata for every filing. Since the HKEx API limits searches to one-month intervals when no stock code is specified, I had to generate monthly date chunks and iterate through them.&lt;/p&gt;

&lt;p&gt;Here's how the &lt;code&gt;generate_monthly_chunks&lt;/code&gt; function works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_monthly_chunks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date_from&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;date_to&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Generate (chunk_from, chunk_to) pairs in 1-month increments (newest first).&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date_to&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;date_to&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date_from&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;date_from&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;chunk_start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;date_from&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_day&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;monthrange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;chunk_end&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_day&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;date_to&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;chunk_start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_end&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For each chunk, the &lt;code&gt;fetch_chunk_via_api&lt;/code&gt; function first sends a POST request to the search page to set the date range in the server's session, then makes paginated GET requests to the JSON API endpoint to retrieve all the records.&lt;/p&gt;

&lt;p&gt;The raw JSON from the API looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"FILE_INFO"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"53KB"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"NEWS_ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"12022263"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"STOCK_NAME"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ZHONGTAIFUTURES"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"STOCK_CODE"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"01461"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"TITLE"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Articles of Association"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"FILE_TYPE"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"PDF"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"DATE_TIME"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"11/02/2026 19:10"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"FILE_LINK"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/listedco/listconews/sehk/2026/0211/2026021100854.pdf"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This data is parsed, cleaned, and stored in a &lt;code&gt;SCHEMAFULL&lt;/code&gt; table in SurrealDB called &lt;code&gt;exchange_filing&lt;/code&gt;.&lt;/p&gt;
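
&lt;p&gt;As an illustration of that parsing step, a cleaned record might be produced like this. This is a hedged sketch rather than the project's actual code: the &lt;code&gt;normalize_record&lt;/code&gt; helper and the &lt;code&gt;BASE_URL&lt;/code&gt; value are my assumptions, and I assume &lt;code&gt;DATE_TIME&lt;/code&gt; uses day/month order (consistent with the date folder in the sample &lt;code&gt;FILE_LINK&lt;/code&gt;):&lt;/p&gt;

```python
from datetime import datetime

# Assumed host for the relative FILE_LINK paths; verify against the live site.
BASE_URL = "https://www1.hkexnews.hk"

def normalize_record(raw):
    """Turn one raw API record into a clean dict ready for SurrealDB."""
    return {
        "news_id": raw["NEWS_ID"],
        "stock_code": raw["STOCK_CODE"],
        "title": raw["TITLE"].strip(),
        "file_type": raw["FILE_TYPE"],
        # "11/02/2026 19:10" is assumed to be DD/MM/YYYY HH:MM.
        "date_time": datetime.strptime(raw["DATE_TIME"], "%d/%m/%Y %H:%M"),
        "file_url": BASE_URL + raw["FILE_LINK"],
    }
```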

&lt;h3&gt;Phase 2: Downloading and Extracting Content&lt;/h3&gt;

&lt;p&gt;With the metadata in place, the next step is to download the actual filing documents (PDF, HTML, or Excel) and extract their content. This is done in parallel using a &lt;code&gt;ThreadPoolExecutor&lt;/code&gt; for efficiency.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_download_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;filing_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="c1"&gt;# ... (implementation to download the document)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
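
&lt;p&gt;The surrounding fan-out logic can be sketched like this. It is a simplified stand-in for the real pipeline: &lt;code&gt;enrich_filings&lt;/code&gt; is a hypothetical helper name, and the injected &lt;code&gt;download&lt;/code&gt; callable stands in for &lt;code&gt;_download_document&lt;/code&gt;:&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def enrich_filings(filings, download, max_workers=8):
    """Download filing documents in parallel.

    `filings` is an iterable of dicts with "id" and "url" keys;
    `download` is any callable taking (url, filing_id).
    Returns {filing_id: content}, with None recorded on failure.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(download, f["url"], f["id"]): f["id"] for f in filings
        }
        for fut in as_completed(futures):
            fid = futures[fut]
            try:
                results[fid] = fut.result()
            except Exception:
                results[fid] = None  # record the failure and keep going
    return results
```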



&lt;p&gt;Once downloaded, the text and any structured tables are extracted using &lt;code&gt;PyMuPDF&lt;/code&gt; for PDFs and &lt;code&gt;BeautifulSoup&lt;/code&gt; for HTML. This extracted content is then saved back to the corresponding record in the &lt;code&gt;exchange_filing&lt;/code&gt; table.&lt;/p&gt;
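
&lt;p&gt;For the HTML case, the idea can be demonstrated with nothing but the standard library. The project itself uses &lt;code&gt;BeautifulSoup&lt;/code&gt;; the dependency-free stand-in below is mine, for illustration only:&lt;/p&gt;

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collects visible text, skipping script and style blocks."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())

def html_to_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = _TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)
```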

&lt;h2&gt;The Graph Model: Connecting the Dots&lt;/h2&gt;

&lt;p&gt;This is where SurrealDB's multi-model capabilities shine. I wanted to not only store the filings but also understand the relationships between them. I defined two types of graph edges using &lt;code&gt;TYPE RELATION&lt;/code&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;code&gt;(company)-[has_filing]-&amp;gt;(filing)&lt;/code&gt;: This links a company to the filings it has released.&lt;/li&gt;
&lt;li&gt; &lt;code&gt;(filing)-[references_filing]-&amp;gt;(company)&lt;/code&gt;: This links a filing to other companies mentioned in its title.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This simple graph model allows for powerful queries, such as "find all filings from company X that mention company Y," which would be complex and slow to execute in a traditional relational database.&lt;/p&gt;
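
&lt;p&gt;That cross-company question maps onto a single graph traversal. The SurrealQL below is a sketch under my assumptions (the record ids &lt;code&gt;company:x&lt;/code&gt; and &lt;code&gt;company:y&lt;/code&gt; are placeholders), not code taken from the project:&lt;/p&gt;

```sql
-- Filings released by company:x that also reference company:y
SELECT ->has_filing->(exchange_filing
    WHERE ->references_filing->company CONTAINS company:y).* AS filings
FROM company:x;
```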

&lt;p&gt;Here's a snippet of the code that creates the &lt;code&gt;has_filing&lt;/code&gt; edges:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;link_filings_to_companies&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ticker_set&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# ...
&lt;/span&gt;    &lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Linking filings to companies via graph edges...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ...
&lt;/span&gt;    &lt;span class="n"&gt;update_query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;UPDATE &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;COMPANY_TABLE&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; SET filings += &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;filing_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RELATE &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;company_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;-&amp;gt;has_filing-&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;filing_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; CONTENT {{ at: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;filing_date&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; }};&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;From a Single Script to an Open-Source Project&lt;/h2&gt;

&lt;p&gt;The initial version of this tool was a single, 1500-line Python script. While functional, it was difficult to maintain and not very user-friendly. I decided to refactor it into a proper, modular open-source project.&lt;/p&gt;

&lt;p&gt;This involved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Decoupling Dependencies&lt;/strong&gt;: Removing hardcoded dependencies on my private company data table, making the graph linking feature optional and configurable.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Modularization&lt;/strong&gt;: Breaking the monolithic script into logical modules (&lt;code&gt;api.py&lt;/code&gt;, &lt;code&gt;db.py&lt;/code&gt;, &lt;code&gt;extractor.py&lt;/code&gt;, etc.).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Packaging&lt;/strong&gt;: Creating a &lt;code&gt;pyproject.toml&lt;/code&gt; file to make the project installable via &lt;code&gt;pip&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;CLI&lt;/strong&gt;: Building a user-friendly command-line interface with &lt;code&gt;argparse&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Documentation&lt;/strong&gt;: Writing a comprehensive &lt;code&gt;README.md&lt;/code&gt; with installation instructions, configuration details, and usage examples.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Conclusion and Next Steps&lt;/h2&gt;

&lt;p&gt;The result is &lt;code&gt;hkex-filing-scraper&lt;/code&gt;, a robust and easy-to-use tool for building a comprehensive database of HKEx filings. It's now available on GitHub and installable via PyPI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub Repo:&lt;/strong&gt; &lt;a href="https://github.com/simonplmak-cloud/hkex-filing-scraper" rel="noopener noreferrer"&gt;https://github.com/simonplmak-cloud/hkex-filing-scraper&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This project was a fun journey into reverse engineering, data pipeline design, and the power of multi-model databases. Future plans could include adding support for other exchanges like the SEC EDGAR database or building a web interface to explore the data.&lt;/p&gt;

&lt;p&gt;I encourage you to check out the repository, try it out for your own financial analysis projects, and contribute if you find it useful. Feedback is always welcome!&lt;/p&gt;

</description>
      <category>python</category>
      <category>opensource</category>
      <category>database</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
