DEV Community: 兆鹏于

AI Skill Practical Training Course — Student Step-by-Step Guide

兆鹏于 — Sun, 05 Jul 2026 02:00:41 +0000

AI Skill Practical Training Course — Student Step-by-Step Guide

Target Audience: Students enrolled in the AI Skills Competition

Course Goal: Complete each challenge and take home a ready-to-run Skill

Prerequisite: TeleAgent desktop application installed

Pre-Course Preparation (5 minutes)

1. Verify Your TeleAgent Is Ready

Open TeleAgent desktop and confirm:

Check Item	Action	Expected Result
Version	Click top-right gear icon → About	Version >= 1.2.0
Skill Directory	Left sidebar → Skills	Can see "My Skills" list
Network	Send a test message	Receives normal response

2. Download Course Resource Pack

Download from the course link:

skill-templates.zip — 8 in-class Skill templates
demo-data.zip — Demo data (sanitized)
cheat-sheet.pdf — Quick reference card (one page)

3. Import Your First Skill (Pre-class Exercise)

Steps:

Extract skill-templates.zip
Open TeleAgent → Skills → Import Skill
Select the l1-01-info-extractor-archiver folder
Click "Import" → See "Import Successful" message

If import fails: Check if the folder contains a skill.yaml file — this is the Skill's "ID card"

Challenge Roadmap

Challenge 1: Info Extractor & Archiver  →  Challenge 2: Daily Report Generator
[30 min]                                    [30 min]
Difficulty: *

Challenge 3: Material Audit Assistant   →  Challenge 4: Knowledge Base Q&A
[30 min]                                    [30 min]
Difficulty: **

Challenge 5: Permission Self-Check      →  Challenge 6: Message Linker
[30 min]                                    [30 min]
Difficulty: **

═════════════════════════════════
Challenge 7: Unified Entry Gateway (L2) →  Challenge 8: Multi-Agent Complaint Flow
[45 min]                                    [45 min]
Difficulty: ***

Completion Reward: 8 Skills + 1 Skill Certification

Challenge 1: Information Extraction & Archiving (30 min)

Challenge Goal

Learn to create a Skill that extracts structured information from messy chat records or OCR text and auto-archives it.

Real-World Application

Inspection photos from work group → Auto extract "time, location, issue, handler"
Customer says "My broadband is broken" → Auto extract "fault type, address, contact"
Receive Excel table → Auto extract key fields and organize

Challenge Steps

Step 1: Import Template (5 min)

Path: TeleAgent → Skills → Import Skill → Select l1-01-info-extractor-archiver

You should see:

Skill name: Info Extractor & Archiver
Version: 1.0.0
Level: L1 (Basic)

Step 2: Understand Parameter Configuration (10 min)

Open the Skill config panel, you'll see three parameters:

Parameter	Type	Required	Default	Purpose
input_content	Text	Yes	-	Paste chat records, OCR results, or file summaries
output_format	Select	No	JSON	Output format: JSON/Markdown/Table
strict_mode	Toggle	No	Off	Strict mode: error on missing fields

Try it:

Paste this simulated chat record in input_content:

   [Work Group - Nanchang Maintenance]
   Zhang San 09:15: Optical cable cut on Beijing East Road, Qingshanhu District, affecting 3 neighborhoods
   Li Si 09:20: Sent Master Wang to the site, estimated 2-hour recovery
   Wang Wu 09:45: Site photos sent, excavator construction caused it, police notified

output_format: "JSON"
strict_mode: "Off"
Click "Run"

Expected output:

{
  "status": "success",
  "data": {
    "time": "09:15",
    "location": "Beijing East Road, Qingshanhu District",
    "event": "Optical cable cut",
    "impact": "3 neighborhoods",
    "handler": "Master Wang",
    "estimated_recovery": "2 hours",
    "site_status": "Excavator construction, police notified"
  },
  "missing_fields": [],
  "confidence": 0.92
}

Success marker: See the JSON output above — Skill correctly extracted information

Step 3: Modify for Your Own Scenario (10 min)

Customize extraction fields:

time → keep
location → keep (or change to "area")
event → change to "fault_type"
impact → change to "affected_users"
handler → keep
estimated_recovery → change to "resolution_deadline"

Step 4: Connect to Feishu Table (5 min, optional)

Configure auto-archiving to a Feishu multi-dimensional table:

Configure Feishu MCP connector in TeleAgent
Add output configuration with field mapping

Challenge 2: Daily/Weekly Report Auto-Generation (30 min)

Challenge Goal

Create a Skill that reads data sources, auto-calculates metrics, and generates formatted reports.

Real-World Application

Daily fault ticket summary → "Broadband Fault Daily Report"
Weekly sales data → "Business Weekly Report"
Monthly complaint data → "Service Quality Monthly Report"

Challenge Steps

Step 1: Import Template

Import l1-02-daily-weekly-report-generator

Step 2: Configure Data Source (10 min)

Try with simulated data:

Paste fault ticket statistics
Select "Daily Report" template
Run and see formatted output with key metrics, TOP3 issues, and recommendations

Step 3: Connect Real Data Source (10 min)

Three options:

Option A: Connect Excel file
Option B: Connect Feishu table
Option C: Manual paste (simplest)

Step 4: Customize Report Template (5 min)

Edit the Skill → Find "report_templates" → Copy and modify the daily report template.

Challenge 3: Material Audit Assistant (30 min)

Challenge Goal

Create a Skill that auto-checks submitted materials for completeness and format compliance.

Real-World Application

Review project proposals → Check for "budget table, feasibility report, risk assessment"
Review contract attachments → Check for "signature page, stamp page, attachment list"
Review expense claims → Check for "receipts, approval forms, detail sheets"

Key Concept: Audit Rule Table

Rule Name	Check Content	Required	Format Requirement
Application Form	Is form present	Yes	File exists
Budget Detail	Is budget table present	Yes	Excel format, amount column exists
Feasibility Report	Is report present	Yes	Word/PDF, >= 5 pages
Risk Assessment	Is assessment present	No	Bonus if present, warning if absent
Signature Page	Is signature present	Yes	Image/PDF, contains signature

Try uploading a simulated material pack (3 files, deliberately missing 1) and run the audit.

Challenge 4: Knowledge Base Q&A Assistant (30 min)

Challenge Goal

Create a Skill that connects to a knowledge base and answers employee questions.

Real-World Application

New employee asks "How to apply for VPN?" → Auto find answer from KB
Customer service asks "What does this error code mean?" → Auto query fault KB
Sales asks "What does this plan include?" → Auto query product KB

Two Ways to Configure Knowledge Base

Method A: Upload Documents (Simple)

Prepare Word/PDF/Markdown knowledge docs
Upload to Skill config
Skill auto-parses and builds index

Method B: Connect Existing KB (Advanced)

Configure KB MCP connector in TeleAgent
Specify KB ID or URL
Skill auto-queries and returns answers

Challenge 5: Permission Self-Check (30 min)

Challenge Goal

Learn to check Skill permission configurations, ensuring "least privilege principle" and avoiding security risks.

Why It Matters

A Skill with "delete all data" permission can cause severe damage if misused or attacked
Customer requirement: Skills must have minimized permissions, delete operations require double confirmation

Key Concept: Permission Audit

The Skill auto-checks another Skill's permission config and reports:

Risky permissions (e.g., unnecessary delete or broad data access)
Recommendations for permission tightening
Risk level rating (LOW/MEDIUM/HIGH)

Challenge 6: Message Linker (30 min)

Challenge Goal

Create a Skill that monitors enterprise WeChat group messages and auto-triggers other Skills based on keywords.

Real-World Application

Someone says "fault" in group → Auto invoke fault diagnosis Skill
Someone says "daily report" → Auto invoke report generator Skill
Someone says "audit" → Auto invoke material audit Skill

Keyword Rule Configuration

Keyword	Match Type	Invoke Skill	Reply Method
fault/outage/offline	Contains	KB Q&A	Reply in group
daily report/weekly/stats	Contains	Report Generator	Direct message
audit/check/materials	Contains	Material Audit	Direct message
help/how to	Contains	Usage Guide	Reply in group

Advanced Challenge: L2 Skills (90 min)

Challenge 7: Unified Entry Gateway (45 min)

Challenge Goal

Create a "smart front desk" Skill that understands user requests and auto-routes to the corresponding L1 Skill.

Scenario

User says: "Check yesterday's fault daily report for Nanchang, and also see if there are any complaints in Qingshanhu District"
→ Gateway understands: Two requests (daily report + complaints)
→ Auto invokes: Report Generator + KB Q&A
→ Merged response: One complete report

Key Concepts

Intent Recognition: Determine what the user wants (query data? generate report? audit materials?)
Routing Rules: Based on intent, dispatch to the corresponding Skill
Result Merging: Combine results from multiple Skills into one response

Challenge 8: Multi-Agent Complaint Handling Flow (45 min)

Challenge Goal

Create a workflow Skill that auto-handles customer complaints: Receive → Classify → Dispatch → Process → Respond.

Scenario

Customer complains: "My broadband hasn't been fixed for 3 days, customer service said someone was sent but no one came!"
→ Auto classify: Fault complaint + Service complaint
→ Auto dispatch: Technical dept + Customer service dept
→ Auto process: Check ticket status + Escalate
→ Auto respond: Inform processing progress and estimated resolution

Flow Nodes

[Receive Complaint] → [Sentiment Analysis] → [Classify] → [Dispatch] → [Process] → [Respond] → [Archive]
        ↓                    ↓                  ↓           ↓           ↓           ↓           ↓
   Extract info         Assess urgency     Tech/Service  Find owner  Check system  Generate reply  Write to table

Post-Course Skill Certification

Certification Standards

Level	Requirement	Certification Content
Bronze	Complete L1 x 6	Basic Skill config + simple modifications
Silver	Complete L1 x 6 + L2 x 1	Can configure routing rules
Gold	Complete all 8 challenges	Can independently design multi-Agent workflows

Certification Method

Submit your 8 Skill configuration files
Record a 3-minute demo video (any Skill running)
Complete the student feedback form

Appendix

A. Quick Reference Card

Import Skill: Skills → Import → Select folder → Confirm
Edit Skill:   Click Skill → Edit → Modify parameters → Save
Test Skill:   Enter parameters → Click Run → View output
Debugging:    Check logs → Check parameters → Check permissions

Parameter Types: Text/Number/Toggle/Select/File
Required Mark:   ✅ Must fill  |  ❌ Can leave blank
Default Value:   Auto-used when not specified

Common Errors:
• yaml indent error → Use 2 spaces, not Tab
• Permission denied → Contact admin to enable
• Data source unreachable → Check network/URL/permissions
• Wrong output format → Check output.format config

B. Troubleshooting Guide

Problem	Debug Steps	Contact
Skill import fails	1.Check skill.yaml format 2.Check file encoding (UTF-8) 3.Check indentation (2 spaces)	Instructor/TA
No output on run	1.Check if parameters are filled 2.Check network 3.View run logs	Instructor/TA
Wrong output format	1.Check output.format config 2.Check template syntax 3.Check data source format	TA
Permission denied	1.Check Skill permission config 2.Contact admin 3.Use least privilege	Admin
Feishu connection fails	1.Check Feishu URL 2.Check permissions 3.Check network	Feishu Admin

C. Post-Course Resources

Course Repository: https://github.com/yuzhaopeng-up/openclaw-workspace
TeleAgent Official: https://www.teleai.com.cn/product/super-agent
ClawHub Skill Marketplace: https://clawhub.ai
More Skills by Yu Zhaopeng: financial-ai-skills (104 financial AI skills), skill-framework (208-skill classification), soe-compliant-office (20 SOE-compliant Skills)

Version: v1.0 | Updated: 2026-06-20 | Course: AI Skill Practical Training

Ten Layers of AI Skill Construction: A Systematic Framework from Prompts to Business Closed Loops

兆鹏于 — Sun, 05 Jul 2026 01:52:03 +0000

Ten Layers of AI Skill Construction: A Systematic Framework from Prompts to Business Closed Loops

Large model applications are evolving from "conversational Q&A" to "skill-based execution." When an AI assistant no longer just chats with you but can automatically complete an entire business workflow, it needs more than a good prompt—it needs a structured Skill system.

But here's the problem: How should Skills be constructed? The gap between the simplest prompt file and an end-to-end business closed loop is enormous. Many teams don't know which stage they're at, let alone where to go next.

This article distills AI Skill construction into ten progressive layers, from the most basic pure-prompt Skill to the most complex master-level business closed loop, forming a complete systematic framework. Each layer has clear capability boundaries, typical structures, and evaluation criteria to help you定位 your current level and plan your upgrade path.

Layer 1: Pure Prompt Skill — The Zero-Code Starting Point

This is the most basic form of Skill construction: a single Markdown file containing role definitions, behavioral rules, and output format requirements. No code, no scripts—completely dependent on the large model's language understanding to execute tasks.

Typical Structure: Single SKILL.md file

Core Capability: Through carefully designed prompts, enable AI to produce more accurate and standardized output in specific scenarios. For example, a "Meeting Minutes Organizer Skill" only needs to specify in Markdown: which dimensions to extract information from, what format to output, and which fields are required.

Evaluation Criterion: Your Skill has only one file, and the AI executes without calling any external tools, completing tasks entirely through "read instructions → understand → output."

The value of this layer is often underestimated. A well-written prompt Skill may outperform a poorly coded Skill. The key is: Are the rules specific enough? Are the boundaries clear enough? Are the examples representative enough?

Layer 2: Component Skill — Structured Enhancement with Resources

When pure prompts aren't enough, you need to "equip" the AI. This is the Component Skill: adding a references directory (reference materials), scripts directory (execution scripts), and assets directory (template resources) on top of the SKILL.md.

Typical Structure: SKILL.md + references/ + scripts/ + assets/

Core Capability: AI can consult reference documents during execution, call scripts to process data, and use templates to generate files. The Info-Extractor is a typical example—SKILL.md defines extraction rules, references contain field mapping tables, and scripts might include formatting utilities.

Evaluation Criterion: Your Skill has multiple files, and the AI needs to read documents from references to guide its behavior or call scripts for specific operations.

The key breakthrough at this layer is moving from "AI figuring it out" to "AI having references to consult." Reference materials give AI's judgments a basis; scripts give AI's actions guarantees.

Layer 3: Workflow Skill — Multi-Step Decision Trees

Tasks that can't be completed in a single call need to be broken down into multiple steps, each with its own judgment logic. Workflow Skills introduce decision tree structures: what to do first, what to do next, and which branch to take under what conditions.

Typical Structure: SKILL.md includes a Workflow section with Step 1 → Step 2 → Step 3, each step having prerequisites and deliverables.

Core Capability: Breaking down complex tasks into ordered step sequences, each step with clear input, processing logic, and output. For example, a Data Analysis Skill: Step 1 Data Validation → Step 2 Statistical Calculation → Step 3 Anomaly Detection → Step 4 Insight Generation.

Evaluation Criterion: Your Skill has a clear sequence of steps, with data passing between steps and conditional branches (if-else logic).

The breakthrough at this layer is "from one-shot to procedural." AI no longer "sees a question and answers" but "executes step by step," with each step's output becoming the next step's input.

Layer 4: Orchestration Skill — Multi-Agent Coordination

When a Skill's steps become too complex, or certain steps require completely independent contexts, you need multiple AI Agents each responsible for one step, passing information between them through structured data. This is the core idea of Orchestration Skills.

Typical Structure: Phase-Orchestrator mandatory orchestration protocol—each Phase executed by an independent sub-Agent, with JSON data passing between Phases.

Core Capability: True multi-Agent parallel or sequential collaboration. Phase 1's Agent completes information extraction, passes results in JSON format to Phase 2's Agent for analysis, which passes to Phase 3 for security review, and so on.

Evaluation Criterion: Your Skill explicitly uses Phase-Orchestrator scheduling, with each Phase being an independent sub-Agent and structured JSON data passing protocols between Phases.

This is the watershed of Skill construction. The first three layers are all "one Agent does everything"; from Layer 4, it becomes "multiple Agents collaborate to accomplish one thing." The benefit is cleaner contexts and more focused responsibilities for each Agent; the downside is significantly increased orchestration complexity.

Layer 5: Security Skill — Permission Control and Protection

When Skills start acquiring the ability to call external tools, access databases, and manipulate files, security becomes a must-have built-in capability rather than an optional add-on. The core of Security Skills is the principle of least privilege: no permissions by default; all permissions must be explicitly declared and approved.

Typical Structure: Security-Guard component—checking permission configurations, data access scope, sensitive field handling, outbound/delete actions, and audit logs, outputting risk ratings and remediation suggestions.

Core Capability: Performing security reviews before Skill execution, identifying excessive permissions, sensitive data leakage risks, and high-risk operations, providing L1-L5 risk ratings.

Evaluation Criterion: Your Skill system has a dedicated security review component, and any Skill involving data access, external communication, or file operations must pass security review before execution.

This layer addresses the pain point of "AI being too capable and therefore dangerous." A Skill without security controls is like a car without brakes—the faster it goes, the greater the risk.

Layer 6: Scoring Skill — Rule Engine Parameterization

Business scenarios often require "scoring by rules": customer opportunity scoring, supplier evaluation, churn risk prediction... These scoring rules change with business conditions. If rules are hardcoded in Skills, every rule change requires modifying the Skill; if rules are parameterized in YAML configuration, business personnel only need to modify the configuration without changing the Skill.

Typical Structure: Scoring-Engine—rules stored in YAML configuration, Skill only responsible for "read rules → execute scoring → output results," 4-Phase mandatory orchestration (information extraction → knowledge retrieval → data analysis → report generation).

Core Capability: Separation of business rules from execution logic. When rules change, modify YAML; when processes change, modify the Skill—neither interferes with the other.

Evaluation Criterion: Your Skill uses external configuration files (YAML/JSON) to store business rules, reading rules dynamically during execution rather than hardcoding them.

This layer embodies an important engineering principle: separation of configuration and code. In the Skill context, this is even more significant—because a Skill's "code" is its prompt, which is easier to break when modified, making it even more important to externalize volatile parts.

Layer 7: Validation Skill — Multi-Source Evidence Cross-Validation

When decisions depend on multiple data sources, information from a single source may be unreliable. The core capability of Validation Skills is extracting evidence from multiple independent sources, cross-validating, detecting contradictions, and ultimately providing confidence-scored judgments.

Typical Structure: Evidence-Chain—receiving multi-source data (complaint records, system alerts, operation logs, SLA metrics), extracting evidence → cross-validation → conflict detection → confidence assessment → root cause inference.

Core Capability: Not "trusting one data source" but "letting multiple data sources corroborate each other." If two of three sources say A and one says B, it's not about majority rule—it's about analyzing why B contradicts A: data latency, inconsistent metrics, or a genuine anomaly.

Evaluation Criterion: Your Skill obtains information from at least 2 independent data sources, has explicit cross-validation logic, and outputs include confidence scores.

This layer addresses the "information credibility" problem. In enterprise AI applications, misjudgments from single information sources carry extremely high costs; multi-source cross-validation is a key means of risk reduction.

Layer 8: Approval Skill — Human-in-the-Loop Risk Control

When AI needs to execute high-risk operations (sending group messages, modifying customer data, deleting work order records), it must not execute directly—human confirmation is required. Approval Skills implement the "human-in-the-loop" mechanism.

Typical Structure: Human-In-Loop component—automatically assesses operation risk level (L1-L5), generates approval forms for medium-to-high risk operations (with risk alerts, content preview, confirmation options), executes after human confirmation, and archives the entire process.

Core Capability: Shifting from "AI acts first, humans check later" to "humans approve for high-risk operations." L1-L2 operations execute automatically; L3 prompts user attention; L4-L5 requires human confirmation before execution.

Evaluation Criterion: Your Skill system has clear risk classification and human approval mechanisms, with a "AI request → human confirmation → execution" three-step closed loop.

This layer addresses the "trust boundary" problem. No matter how capable AI becomes, there are scenarios where it shouldn't make decisions alone. Human-in-the-loop isn't distrust of AI—it's necessary protection for business security.

Layer 9: Composition Skill — Multi-Skill Orchestration Benchmark

A single Skill solves one problem, but real businesses need multiple Skills collaborating to complete an entire workflow. Composition Skills orchestrate 5+ base Skills into an end-to-end business pipeline.

Typical Structure: Take the "Intelligent Data Query Dashboard" as an example—one sentence from the user → permission check → intent understanding → data query → statistical computation → chart rendering. 5 Skills collaborate: NL2Query (intent recognition) → Security-Guard (permission verification) → Data-Executor (secure query) → Data-Aggregator (aggregation computation) → Visualization-Renderer (chart rendering), orchestrated by a Gateway Skill.

Core Capability: One sentence triggers a complete pipeline of 5+ Skills, each Skill called as an independent component, the whole presenting as a unified business entry point.

Evaluation Criterion: Your Skill system has 5+ Skills linked through unified orchestration, and users can complete end-to-end business processes with just one natural language input.

This layer represents the transition "from tool to system." A single Skill is a hammer; a Composition Skill is an entire workshop. But composition is far more difficult than individual parts—interface protocols, data formats, error propagation, performance bottlenecks—each is an engineering challenge.

Layer 10: Closed-Loop Skill — End-to-End Business Closed Loop System

This is the highest form of Skill construction: 8+ Skills collaborating, covering the complete business closed loop from "understanding intent" to "archiving and precipitation," not only completing one task but also self-learning and continuous evolution.

Typical Structure: Take the "Enterprise Customer Intelligent Operations Assistant" as an example—8-step closed loop: understand intent → multi-source query → rule scoring → evidence validation → root cause location → human confirmation → execution archiving → visual output. 11 Skills collaborate, including 6 base components, 3 middleware, and 2 orchestrators.

Core Capability:

End-to-end coverage: complete chain from input to archiving, no "breakpoints"
Self-evolution: each execution's experience automatically precipitates into the knowledge base, subsequent executions become increasingly precise
Observability: full process audit trail, input/output and decision basis of each step traceable
Resilience: degradation strategies when a Skill fails, preventing entire pipeline collapse

Evaluation Criterion: Your Skill system covers the complete business closed loop (input → analysis → decision → execution → archiving), 8+ Skills collaborating, with self-evolution and observability capabilities.

This layer is the ultimate goal of Skill construction. Not building one or two useful tools, but constructing a business operating system that can run autonomously, evolve continuously, and is traceable and auditable.

How to Assess Your Skill Level

Based on the ten layers above, you can quickly assess your current Skill construction level:

Layer	Core Characteristic	Have You Reached This?
Layer 1	Pure prompt	You've written a reusable SKILL.md
Layer 2	With resources	Your Skill has references or scripts
Layer 3	Workflow	Your Skill has multi-step decision trees
Layer 4	Multi-Agent orchestration	You've used Phase-Orchestrator to schedule sub-Agents
Layer 5	Security governance	Your Skill system has security review mechanisms
Layer 6	Rule engine	Your Skill uses YAML configuration to drive scoring rules
Layer 7	Cross-validation	Your Skill cross-validates from multi-source data
Layer 8	Human-in-the-loop	Your high-risk operations require human approval
Layer 9	Composition orchestration	You've orchestrated a 5+ Skill complete pipeline
Layer 10	Business closed loop	You've built an 8+ Skill end-to-end closed loop system

Most teams stay at Layers 1-3. Reaching Layer 4 means you have genuine engineering capability. Layer 8 means your AI system has a "security baseline." At Layer 10, what you've built is no longer a Skill—it's an AI business operating system.

Upgrade Path from Layer 1 to Layer 10

You don't need to jump to Layer 10 in one go. A pragmatic upgrade path is:

Reach Layer 3 first: Master workflow design to ensure your Skill can complete full task processes
Then Layer 5: Add security governance to ensure your Skills operate within security boundaries
Then Layer 8: Introduce human-in-the-loop to ensure high-risk scenarios have human oversight
Finally aim for Layer 10: Orchestrate all components into a closed loop system

Each layer is a natural extension of the previous one. Layer 4 (orchestration) is the Agent-ified version of Layer 3 (workflow); Layer 5 (security) is a necessary supplement to Layer 4 (orchestration)—you can't let orchestrated Agents run wild; Layers 6-8 represent business depth—scoring, validation, and approval are all must-haves for enterprise applications; Layers 9-10 are the fruits of systems engineering—orchestrating all components into a unified whole.

Conclusion

Skill construction is not a simple "write a good prompt and you're done" endeavor, but a systematic engineering journey from single files to multi-Agent collaboration to business closed loops. The ten-layer classification isn't meant to create anxiety but to provide a clear growth path—where you are now, where to go next, and what new capabilities each layer requires.

Remember one principle: First make the Skill run, then make it run securely, finally make it run as a system. This is the core logic of the ten-layer framework, and the pragmatic path from concept to implementation.

Yu Zhaopeng is a financial AI expert who has published 24 professional books. Explore his open-source Skill frameworks: financial-ai-skills (104 financial AI skills), skill-framework (208-skill classification system), soe-compliant-office (20 SOE-compliant office Skills).

SOE Compliant Office Agent Skill System: 20 Skills Across 3 Domains

兆鹏于 — Sat, 04 Jul 2026 04:57:31 +0000

SOE Compliant Office Agent Skill System: Full-Stack Practice of 20 Skills Across 3 Domains

Introduction: Why State-Owned Enterprises Need Dedicated Compliance Office Skills

There are plenty of AI office tools on the market -- writing documents, making PPTs, creating reports, the basic four-suite can handle it all. But in the context of State-Owned Enterprises (SOEs), three fatal problems immediately emerge:

Compliance Gap: Party committee pre-review, "Three Majors and One Large" decisions, GB9704 document format -- generic tools have no understanding of these regulations
Audit Blind Spots: Who used AI to change what and when? No traceability means inspection teams get blank stares
Standards Disconnect: Xinchuang (domestic IT) requirements, domestic model adaptation, penetrative supervision -- generic solutions avoid all of these

We spent 6 months building 20 Skills across a three-domain architecture -- not a fine-tuned version of generic office tools, but purpose-built from zero for SOE needs.

Three-Domain Architecture: Document Operations + Compliance Security + Reporting Analysis

Domain	Skills	Phase
Document Operations	doc-formatter, doc-template, doc-compare, meeting-minutes, contract-check, red-letter	Phase 1 (8 skills, done)
Compliance Security	party-review, triple-major, gb9704-check, audit-trail, data-mask, poc-filter	Phase 2 (6 skills, done)
Reporting Analysis	ops-digest, risk-weekly, kpi-dashboard, var-analysis, peer-compare, forecast	Phase 3 (6 skills, done)

Three Key Differentiators

1. Built-in Compliance

Not "check compliance after writing" but "write within a compliance framework from the start." The Red-Header Document Generator Skill, for example, validates every step against national standards -- layout position, copy numbering, classification markings, urgency levels -- all with corresponding GB/T 9704 validation rules.

2. Audit Trail by Default

Every AI operation's complete chain is recorded locally in the .audit/ directory:

Who called which Skill and when
What raw data was input (archived after desensitization)
What intermediate results AI produced
What final content was output
Whether human confirmation was obtained

3. National Standards Ready

20 Skills incorporate 15 SOE-specific standard systems:

Standard	Skills	Implementation
GB/T 9704-2012	doc-formatter, gb9704-check, red-letter	147-item validation
SASAC EVA Assessment	ops-digest, kpi-dashboard	Automatic calculation
"Three Majors" Decision System	triple-major, party-review	Amount threshold trigger
SOE Procurement Compliance	contract-check	50 built-in rules
Penetrative Supervision	risk-weekly	Five-dimension framework
Xinchuang Requirements	All skills	Pure Python + domestic models

Technical Implementation: Zero API Cost + Pure Python + Millisecond Response

Key technical features:

Response Speed: All rules run in-memory, single judgment < 10ms
Zero API Cost: No LLM API calls for decisions, pure rule engine
Auditable: Every judgment records rule ID, threshold, and actual value
Configurable: Amount thresholds and rule switches via YAML config

`python

Core logic of the "Three Majors" determination Skill

def check_triple_major(item):
decisions = []
# Rule 1: Major decision -- investment exceeds 5% of net assets
if item.type == "investment" and item.amount > company.net_assets * 0.05:
decisions.append({
"rule": "Major Decision - Large Investment",
"threshold": f"5% of net assets ({company.net_assets * 0.05:.0f}万元)",
"actual": f"{item.amount:.0f}万元",
"action": "Must submit to Party Committee for pre-review"
})
return decisions
`

SOE Maturity Scoring: 1-5 Compliance Levels

Level	Description	Characteristics
SOE-1	Manual Compliance	Pure manual operation, compliance depends on personnel quality
SOE-2	Tool-Assisted	Generic office tools, compliance check via manual review
SOE-3	Built-in Rules	Compliance Skills with automatic validation, human confirmation
SOE-4	Audit Closed-Loop	AI full-chain traceability, compliance proactive alerts
SOE-5	Intelligent Compliance	AI auto-adapts to regulatory changes, compliance as default state

Most central SOEs currently sit between SOE-1 and SOE-2. This Skill system enables a direct leap to SOE-3 or even SOE-4.

Open Source & Cross-Promotion

Repository	Content	License	Link
soe-compliant-office	20 SOE compliance office Skills	MIT	https://github.com/yuzhaopeng-up/soe-compliant-office
financial-ai-skills	104 financial AI skills	MIT	https://github.com/yuzhaopeng-up/financial-ai-skills
teleagent-skills	5 general business skills	Apache 2.0	https://github.com/yuzhaopeng-up/teleagent-skills
skill-framework	208 skill taxonomy + L0-L4 framework	MIT	https://github.com/yuzhaopeng-up/skill-framework
fintech-h5-demos	12 zero-dependency financial H5 demos	MIT	https://github.com/yuzhaopeng-up/fintech-h5-demos

Conclusion: From "Can Use AI" to "Use AI Compliantly"

SOE digitalization isn't a capability problem -- it's a compliance problem. It's not "can we do it" but "can it withstand inspection."

These 20 Skills aim to make compliance the default option -- every operation within a rule framework, every audit trace traceable, every output meeting national standards.

The leap from SOE-1 to SOE-4 hinges not on technology, but on making compliance a built-in process rather than a post-hoc check.

30+ Anti-Fraud Rules Engine: Real-Time Risk Control with Zero API Cost

兆鹏于 — Fri, 03 Jul 2026 12:57:04 +0000

30+条反欺诈规则引擎：零API费的实时风控系统

引言：传统风控的三个致命短板

金融风控领域有句老话："规则引擎人人有，真正好用的没几个。"传统风控规则引擎普遍存在三大痛点——

静态阈值，误报如雨。 单条规则写死一个数字，一旦业务变化，规则就成了摆设。50万的阈值拦住了正常大额贸易，却放过了49.9万的试探交易。

看单笔不见网络。 每笔交易独立评估，无法发现"5个账户把钱转给同一个人，再由这个人集中转走"的星型洗钱模式。团伙欺诈在单笔维度上完美合规。

调用外部API，成本与延迟双高。 每笔交易调一次第三方风控服务，按量计费，高峰期响应飙升到秒级，还伴随着数据外泄的风险。

如果有一个引擎，内置30+条规则，覆盖6大异常维度，纯Python运行无外部依赖，单笔评估耗时不到1毫秒，零API费用——这不是设想，而是已经落地的开源实现。

30+条规则全景图

反欺诈引擎的规则体系覆盖6大异常维度：金额异常、时间异常、频率异常、对方异常、路径异常、账户异常，合计30+条。以下展示4大核心类别，每类精选5条规则。

金额异常类

规则ID	规则名称	阈值说明	置信度
R001	大额交易超阈值	单笔 > 50万	0.85
R002	整数偏好异常	金额为万元整数倍	0.70
R003	金额突增	较历史均值增长 > 300%	0.80
R004	分散小额试探	同一对手方多笔 < 1万小额	0.75
R005	接近上限阈值	金额在限额的90%-100%之间	0.65

金额异常是最直观的信号，也是传统风控最早覆盖的维度。但传统做法往往只盯住"大额"这一个特征，忽略了金额背后的行为模式。R002"整数偏好异常"就是典型的行为特征——真实贸易中，金额通常精确到分（如487,632.58元），而欺诈交易的金额往往凑整（如500,000元），这种"整数偏好"在统计学上具有显著区分度。R003"金额突增"尤其值得注意——一笔突然比历史均值高出3倍以上的交易，即使绝对金额不大，也往往暗示账户已被接管。R004"分散小额试探"识别的是另一种攻击模式：先发多笔小额交易测试通道是否通畅，确认后再发起大额攻击。R005"接近上限阈值"针对刻意试探限额的行为，0.65的置信度虽低，但与其他规则组合后权重迅速放大。

时间异常类

规则ID	规则名称	阈值说明	置信度
T001	凌晨交易	时间在02:00-05:00之间	0.90
T002	节假日交易	中国法定节假日	0.75
T003	短时高频	1小时内交易 > 5次	0.85
T004	非常规时段	非工作时间(22:00-08:00)	0.70
T005	月末密集交易	月末3天内交易频率异常	0.60

T001"凌晨交易"置信度高达0.90，因为正常企业交易极少出现在凌晨2-5点，这个时段的交易几乎必定伴随异常动机。T002"节假日交易"和T004"非常规时段"看起来有重叠，实则不同——节假日关注的是"当天不该有交易"，非常规时段关注的是"这个时段不该有交易"，两者分别捕获周末加班和夜间操作的不同场景。T005"月末密集交易"看似置信度偏低，但在财务造假场景中，月末集中冲量的模式反复出现，与其他规则联动后能有效识别"做账式"欺诈。

频率异常类

规则ID	规则名称	阈值说明	置信度
F001	短时高频交易	1小时内 > 3笔	0.80
F002	交易频率突增	较日均频率增长 > 500%	0.85
F003	账户静默后激活	30天未交易后突然活跃	0.75
F004	对手方首次交易	新增对手方首笔交易	0.70
F005	交易时段过于规律	每隔固定分钟数交易	0.60

F003"账户静默后激活"是侦测账户被盗的关键规则——长期不用的账户突然活跃，大概率不是原持有人在操作。F002"交易频率突增"和F001"短时高频"形成梯度：F001看绝对数量（1小时内3笔），F002看相对变化（比日均高出5倍），两者互补覆盖不同基数账户。F005"交易时段过于规律"检测脚本化攻击，人工操作不可能每隔精确的N分钟触发一笔交易，但自动化脚本会。

对方异常类

规则ID	规则名称	阈值说明	置信度
C001	高风险地区对手方	制裁/洗钱高发地区	0.95
C002	新客户首笔大额	客户注册 < 30天首笔 > 10万	0.85
C003	敏感行业对手方	房地产/博彩/虚拟货币	0.80
C004	壳公司特征	对手方为空壳公司	0.75
C005	交易对手信用评级低	对方评级为B以下	0.80

C001置信度0.95，是所有规则中最高的一条。涉及制裁/洗钱高发地区的交易，几乎不需要二次验证即可拦截。C002"新客户首笔大额"捕获的是"注册即大额"的高风险模式——正常客户首笔交易通常谨慎，上来就10万以上的高度可疑。C004"壳公司特征"需要结合知识图谱的关联分析才能判定，引擎通过交叉匹配工商信息和交易行为实现自动识别。C005"交易对手信用评级低"实现了反欺诈与信用审批的规则级联动。

除了上述4大类20条核心规则，引擎还包含路径异常类5条（资金快进快出、大额分散转入、交易链路闭环、跨境资金中转、地下钱庄特征）和账户异常类5条（信息变更后交易、登录IP突变、多账户关联、长期闲置后启用、风险名单命中），合计30+条全覆盖。

规则引擎架构：从规则配置到处置建议

引擎的核心设计遵循一条清晰的五阶段管线：规则配置 → 并行评估 → 加权评分 → 风险等级 → 处置建议。

规则配置层：每条规则以Python数据类定义，包含规则ID、名称、阈值、置信度、权重五个要素。修改阈值只需改配置，无需动代码。这种"配置即规则"的设计，使得业务人员也能参与规则调优，不必依赖开发团队排期。

并行评估层：所有规则无依赖关系，可并行执行。引擎接收一笔交易数据后，同时对30+条规则进行匹配检查，命中则记录规则ID和置信度。无依赖的另一个好处是新增规则不影响已有规则的评估逻辑，扩展性天然具备。

加权评分层：最终风险分值 = Σ(命中规则置信度 × 规则权重) / Σ(规则权重) × 100。不是简单的"命中几条算几分"，而是置信度加权——高置信度规则（如C001的0.95）对最终分数的贡献远大于低置信度规则（如T005的0.60），避免低置信度规则拉高误报。

风险等级层：评分映射到四个等级——0-25为低风险（绿色，正常通过），26-50为中风险（黄色，加强监控），51-75为高风险（橙色，人工复核），76-100为极高风险（红色，立即拦截）。

处置建议层：不同风险等级对应不同建议行动，从"正常通过"到"立即拦截+冻结账户+通知风控专员"逐级升级。处置建议与风险等级一一绑定，杜绝"高处轻判"。

关联图谱突破：从"看单笔"到"看网络"

单笔交易维度的规则再精密，也看不到资金流转的全貌。这是传统规则引擎最大的盲区。

知识图谱风控模块（knowledge_graph.py, 24KB）与规则引擎（risk_engine.py, 13KB）配合，实现了三种关键图谱模式的检测：

星型转账检测：5个以上账户向同一中心账户分散转入，再由中心账户集中转出——典型的"漏斗型"洗钱结构。对应规则P001"大额分散转入"（置信度0.90）和P005"地下钱庄特征"（置信度0.95，是路径异常类最高的一条）。

循环转账检测：资金在A→B→C→A之间形成闭环，没有真实的商业目的。对应规则P003"交易链路闭环"（置信度0.90）。循环转账是担保链风险的核心形态——互保企业之间资金循环，一旦某一环断裂，整条链崩溃。

链式转账检测：资金沿A→B→C→D逐级传递，每级抽取一定比例。结合担保链分析，能识别出互保企业圈中的风险传染路径。链式传递的隐蔽性在于每一跳都像一笔正常交易，只有拉出完整链路才暴露异常。

三者的共同点是：在单笔维度上完全合法。只有把交易网络画出来，才能看见异常的拓扑结构。这正是知识图谱赋予规则引擎的"上帝视角"。

信用评分联动：反欺诈 + 信用审批协同作战

反欺诈和信用审批是风控的一体两面：欺诈是"故意骗"，信用是"还不上"。引擎设计了两者的联动机制。

反欺诈引擎输出的风险评分，作为信用审批的输入之一。信用审批基于5维评分（资产负债率20分、流动比率20分、利润率20分、经营年限20分、行业风险20分），输出0-100的信用评分和AAA-D十个等级。十个等级对应不同的信用系数和抵押折扣——AAA级信用系数0.5、抵押折扣0.8，往下逐级衰减，C级归零直接拒绝。

更深层的是财务预警联动。信用审批引擎内置杜邦分析和Altman Z-score两个经典模型：

杜邦分析将ROE拆解为净利率 × 资产周转率 × 权益乘数三个因素，定位盈利能力、运营效率、财务杠杆中的薄弱环节。ROE > 20%为优秀，低于10%即进入较差区间。

Z-score模型通过营运资本/总资产、留存收益/总资产、EBIT/总资产、股东权益/总负债、营业收入/总资产五个变量加权计算：Z = 1.2X1 + 1.4X2 + 3.3X3 + 0.6X4 + 1.0X5。Z > 2.99为安全区，1.81-2.99为灰色区需关注，Z < 1.81进入破产区。

当反欺诈引擎标记一笔交易为"高风险"且信用审批显示Z-score处于灰色区时，系统自动升级风险等级——这不再是单维度的规则触发，而是跨模型的协同预警。

企微实战：Template Card推送风控预警

检测到风险后，如何第一时间通知风控人员？引擎内置了企业微信Template Card推送。

from fraud_detection.wecom_integration import send_fraud_alert

result = engine.detect(
    amount=500000,
    transaction_type="转账",
    counterparty="新客户",
    transaction_time="凌晨2点"
)

if result.level in ("高", "极高"):
    send_fraud_alert(result, webhook_url="https://qyapi.weixin.qq.com/...")

Template Card展示风险评分、风险等级、触发规则列表和建议行动，风控人员无需登录系统即可在企微群中直接看到告警详情并做出处置决策。对于"极高"等级的交易，建议立即拦截并冻结账户，等待人工复核后再放行。推送消息支持颜色标识——红色对应极高、橙色对应高、黄色对应中，风控人员扫一眼就知道优先级。

零API费 + 毫秒响应：技术实现要点

为什么能做到零API费用？因为规则引擎是纯Python实现，无任何外部API调用。

不调用第三方风控服务：所有规则逻辑内嵌于fraud_engine.py（25KB）
不依赖外部数据库：规则配置、阈值、置信度均以Python数据类定义
不需要网络请求：评分计算完全在本地完成

为什么能做到毫秒级响应？因为没有网络I/O。

30+条规则并行评估，纯CPU计算
单笔交易评估耗时 < 1ms（实测数据）
批量100笔交易评估耗时 < 50ms
可轻松嵌入交易链路的同步校验环节

对比外部API方案：按每笔0.01元计费，日处理10万笔交易，月费用3万元；而纯Python引擎的边际成本为零。延迟方面，外部API往返通常100-500ms，本地引擎不到1ms，快了两个数量级。

一个典型的调用示例：

from fraud_detection import FraudDetectionEngine

engine = FraudDetectionEngine()
result = engine.detect(
    amount=500000,
    transaction_type="转账",
    counterparty="新客户",
    transaction_time="凌晨2点",
    account_history={"avg_amount": 80000, "dormant_days": 45}
)

print(f"风险评分: {result.score}")      # 85
print(f"风险等级: {result.level}")      # 极高
print(f"触发规则: {result.rules}")      # [R001, T001, C002, F003]
print(f"建议行动: {result.actions}")    # [立即拦截, 人工复核, 冻结账户]

四行代码，零配置，零费用，毫秒出结果。开箱即用。

总结：从事后补救到实时拦截

传统风控的困局在于：规则是静态的、视角是单笔的、响应是滞后的。当风控团队发现一笔异常交易时，资金可能已经转了三手，消失在关联账户的网络中。

30+条规则的并行评估解决了"静态"问题——多维度交叉验证，单条规则的误报被其他规则的置信度稀释。知识图谱的关联分析解决了"单笔"问题——从节点到网络，看见资金的拓扑流动。杜邦分析和Z-score的联动解决了"单维度"问题——反欺诈与信用审批协同，不再各自为战。纯Python本地引擎解决了"滞后"问题——交易到达的那一刻就完成评估，不等待任何外部服务。

从"事后补救"到"实时拦截"，差的不是理念，而是一个开箱即用、零成本、毫秒响应的引擎。

Agent Skills 开源生态

本文涉及的技能和框架已开源，欢迎 Star / Fork / PR：

仓库	内容	协议	链接
financial-ai-skills	104个金融AI技能，零API费	MIT	https://github.com/yuzhaopeng-up/financial-ai-skills
teleagent-skills	5个通用Agent技能(评分引擎/证据链/数据聚合/可视化/NL2Query)	Apache 2.0	https://github.com/yuzhaopeng-up/teleagent-skills
agent-cluster-comm	5层集群通信技能(L1-L5)	Apache 2.0	https://github.com/yuzhaopeng-up/agent-cluster-comm
skill-framework	208技能分类体系+L0-L4框架+YAML模板	MIT	https://github.com/yuzhaopeng-up/skill-framework
fintech-h5-demos	12个零依赖金融H5演示	MIT	https://github.com/yuzhaopeng-up/fintech-h5-demos

AI生成

Enterprise Due Diligence Agent: AI Reports for 60+ Real Companies

兆鹏于 — Fri, 03 Jul 2026 12:43:21 +0000

企业尽调智能体实战：60+真实企业的AI尽调报告

从5天到10分钟：AI如何重构企业尽调

企业贷前尽调，银行和金融机构最头疼的环节。一位信贷经理曾这样描述他的工作：打开天眼查查工商信息，切到Wind拉行情，再打开百度搜新闻，最后把散落在七八个系统里的数据拼进Word模板。一家企业，至少5天。如果碰上集团客户、关联方众多的，两周起步。

一家支行行长曾无奈地说："25个客户经理，每个人做的尽调报告格式都不一样。同样的企业，A经理评'低风险'，B经理评'中等风险'，谁对谁错无从判断。"问题的根源不是人的能力差异，而是工具链的碎片化——数据散落在不同系统里，没有统一入口，也没有标准化的采集流程。

我们调研了12家金融机构的尽调流程，发现三个共性痛点：信息散落（数据分布在6-10个系统中）、耗时漫长（单家企业5-10个工作日）、质量参差（依赖个人经验，无标准化流程）。

本文记录的，是一个用AI Agent解决这个问题的实战项目——企业尽调引擎v5.0。它不是概念验证，不是Demo，而是在60+家真实企业上跑通的生产级系统。

技术架构：多源数据整合的数据流

尽调的核心难题不是"分析"，而是"采集"。一家上市公司的完整画像，需要从至少6个异构数据源拉取信息。传统方式是人肉Copy-Paste，我们的方案是用Agent自动编排数据流：

用户输入 "美的集团"
    │
    ▼
┌─────────────────────────────────┐
│  Step 1: 股票代码查询            │
│  联网搜索 → 000333.SZ           │
└──────────────┬──────────────────┘
               │
    ┌──────────┴──────────┐
    ▼                     ▼
┌─────────┐         ┌──────────┐
│ Step 2a │         │ Step 2b  │
│ 实时行情 │         │ 新闻舆情  │
│ ifind   │         │ 联网搜索  │
└────┬────┘         └─────┬────┘
     │                    │
     └─────────┬──────────┘
               │
    ┌──────────┼──────────┐
    ▼          ▼          ▼
┌────────┐ ┌────────┐ ┌────────┐
│Step 3a │ │Step 3b │ │Step 3c │
│工商信息 │ │风险扫描 │ │估值指标 │
│  MCP   │ │  MCP   │ │  MCP   │
└───┬────┘ └───┬────┘ └───┬────┘
    │          │          │
    └──────────┼──────────┘
               │
               ▼
┌─────────────────────────────────┐
│  Step 4: 舆情分析 + 综合评分      │
│  多源交叉验证 → 生成尽调报告      │
│  输出: JSON(5KB) + Markdown(4KB) │
└─────────────────────────────────┘

这个数据流的核心设计原则是并行采集、串行推理。Step 2的行情和舆情可以并行获取，Step 3的三个MCP调用也可以并行，但Step 4的综合评分必须等所有数据到齐后才能做交叉验证。这种设计把端到端耗时压到了10分钟以内。

另一个关键设计是渐进式降级：如果MCP工具不可用（比如企业是非上市公司），引擎会跳过行情和估值模块，仅返回工商+风险+新闻的"基础版"报告，而不是直接报错退出。这一设计在实际使用中至关重要——我们的60+企业样本中，有11家是非上市企业，如果要求所有数据源齐备才能出报告，这11家就会被拒之门外。

五大能力详解

1. 股票代码查询

输入企业名称，自动搜索匹配股票代码。比如输入"美的集团"，引擎通过联网搜索拿到000333.SZ。这个步骤看似简单，却是后续所有数据获取的前提——行情、估值、历史走势全部依赖股票代码。对于非上市企业，引擎会标记stock_code: null并跳过相关模块。在实际测试中，股票代码查询的成功率超过98%，少数失败案例主要是名称变更（如"格力地产"更名为"珠免集团"）尚未被搜索引擎索引。

2. 实时行情数据

通过ifind接口获取实时股价、涨跌幅、成交量、换手率等指标。这些数据直接写入报告的"行情数据"章节，避免分析师手动从交易软件抄录。更重要的是，行情数据与后续的估值指标做交叉验证——如果PE_TTM显示14倍但股价异常波动，报告会标注"数据一致性待确认"。

3. 企业新闻舆情

联网搜索获取企业最新新闻，引擎对新闻做情感分析后输出舆情等级（正面/中性/负面）和舆情得分（0-100）。这不是简单的关键词匹配，而是基于上下文的语义判断。当正面信号和风险信号同时出现时，报告会分别列出，而非简单抵消。一条"美的集团海外营收创新高"和一条"美的集团遭反倾销调查"同时出现时，舆情得分不会因为一正一负就打平，而是会标注"增长信号与政策风险并存"。

4. 工商信息检索

通过MCP工具company_business_info获取法人、注册资本、股东结构、高管团队、行业分类等工商登记信息。这些数据来自官方工商数据库，比人工在天眼查上截图更准确，也有结构化优势——股东持股比例可以直接用于关联方分析。美的集团的工商信息返回显示，第一大股东美的控股有限公司持股30.94%，这种结构化数据可以直接输入关联方分析模型。

5. 风险扫描

通过MCP工具company_risk_info扫描被执行信息、行政处罚、欠税记录、经营异常名录。这是尽调中最关键的环节，也是最容易遗漏的——传统尽调中，信贷经理往往只查一两个维度就交差。引擎把四类风险全部扫一遍，任何一项有记录就直接标红。在60+企业的实测中，我们捕获了3家存在行政处罚记录的企业和1家曾有经营异常标记的企业——这些信息如果用人肉搜索，大概率被遗漏。

实战案例：美的集团尽调报告生成

以美的集团为例，展示完整的尽调流程：

输入："美的集团"（仅企业名称，无其他预设信息）

引擎运行过程：

步骤	动作	耗时	结果
Step 1	联网搜索股票代码	~3s	000333.SZ
Step 2a	ifind获取实时行情	~5s	收盘81.81，涨1.78%
Step 2b	搜索企业新闻	~4s	3条正面，0条风险
Step 3a	MCP获取工商信息	~2s	法人方洪波，注册资本7010万
Step 3b	MCP风险扫描	~2s	风险等级：低，0条执行记录
Step 3c	MCP获取估值指标	~2s	PE 14.08，PB 2.68，市值6224亿
Step 4	综合评分+报告生成	~3s	综合评分65.0，中等风险

总耗时：约20秒（含网络延迟。批量60+企业时，并行处理平均每家不到2分钟）

JSON报告关键字段：

{
  "basic_info": {
    "company_name": "美的集团",
    "stock_code": "000333.SZ",
    "legal_person": "方洪波",
    "registered_capital": "7010万人民币",
    "industry": "电气机械和器材制造业",
    "status": "存续"
  },
  "market_data": {
    "close": 81.81,
    "pct_change": 1.78,
    "turnover_rate": 0.57,
    "volume": 38926688
  },
  "risk_scan": {
    "risk_level": "低",
    "executed_cases": 0,
    "penalties": 0,
    "tax_arrears": 0,
    "abnormal_operations": 0,
    "risk_summary": "企业经营正常，无重大风险信号"
  },
  "assessment": {
    "overall_score": 65.0,
    "risk_level": "中等风险",
    "recommendation": "建议补充材料"
  }
}

注意一个细节：风险扫描显示"低"，但综合评估却是"中等风险"。这是因为舆情得分65拉低了综合评分——市场层面的不确定性被引擎捕获并反映在最终结论中。这种"多源数据交叉验证，取最严结论"的策略，是避免AI幻觉的关键设计。单一数据源说"没问题"不够，必须在多个维度交叉确认后才能给出"低风险"的判断。

60+企业验证：跨行业批量尽调

我们在60+家真实企业上验证了引擎的稳定性。这些企业覆盖四个主要行业：

行业	企业数量	上市/非上市	代表企业
制造业	22	18/4	美的集团、比亚迪、宁德时代
金融业	15	12/3	招商银行、中信证券
消费品	14	10/4	河南双汇(000895)、贵州茅台
科技/互联网	11	7/4	科大讯飞、用友网络

批量运行统计数据：

全量完成率：93.3%（56/60家生成完整报告，4家因企业名称歧义需人工确认）
平均报告生成耗时：1分48秒/家
JSON平均体积：5.2KB，Markdown平均体积：3.8KB
运行日志：engine_log.txt共155KB，记录了完整的调用链和异常处理过程

最典型的失败模式是企业名称歧义——比如输入"华谊"，可能是华谊兄弟也可能是华谊集团。v5.0的处理方式是列出所有匹配项让用户选择，而非猜测一个结果直接返回。这种"宁可多问一次，不可给错数据"的设计哲学，在金融场景下尤为重要。

从行业维度看，制造业的尽调报告完整度最高（22家中21家返回完整6维数据），因为上市比例高、公开信息丰富。金融业的特殊之处在于银行类企业没有传统意义上的"PE/PB"估值逻辑，引擎会自动识别并调整评估维度。

进化历程：从静态Demo到生产引擎

这个项目不是一步到位的。回顾5个版本的迭代，能清晰看到一条从"能跑"到"可信"的路线：

版本	日期	核心改进	局限
v1.0	2026-04-28	静态数据，验证报告模板	全部硬编码，不可复用
v2.0	2026-05-03	多数据源整合，告别硬编码	每家需手动配置数据源
v3.0	2026-05-04	批量尽调引擎+API服务	数据获取仍需人工介入
v4.0	2026-05-10	ArkClaw集成，Agent自动编排	依赖特定API，费用高
v5.0	2026-05-15	联网搜索+MCP双通道，零API费用	MCP需特定环境

最关键的跃迁发生在v4.0到v5.0之间。v4.0依赖Ark API获取数据，每次调用都有费用，60家企业跑一轮的成本不低。v5.0引入了联网搜索作为主数据通道，MCP工具作为补充——零API费用，且数据来源更透明可审计。对于金融机构而言，"零费用"不只是省钱，更是合规加分项——没有第三方数据采购，就没有供应商风险。

v1.0到v3.0的17天里，团队解决的其实是"数据从哪来"的工程问题。v2.0把硬编码改成配置化，v3.0加了批量处理能力，但都需要人工搬运数据到指定位置。直到v4.0引入Agent自动编排，才真正实现了"输入企业名，输出完整报告"的闭环体验。

多源验证：对抗AI幻觉的工程解法

AI生成内容最大的信任危机是幻觉——模型自信地编造一个不存在的数据，看起来比真实数据还像真的。在尽调场景下，一条虚构的风险记录可能导致数亿贷款的误判。

我们的解法不是"让模型更聪明"，而是用数据流设计来物理层面阻断幻觉：

数据与推理分离：报告中的每一个数据点都标注了数据源（ifind/MCP/联网搜索），分析结论仅在数据到齐后生成，禁止模型"先编结论再找证据"
交叉验证标红：同一指标从不同数据源获取后做一致性比对，偏差超过阈值直接标红
渐进式降级：数据源不可用时跳过对应模块，而非降级到"模型猜测"
输出可审计：每份JSON报告附带data_sources字段，记录每个数据点的来源和获取时间，支持全链路溯源

这套机制在60+企业验证中表现稳定：没有发现一例虚构数据的情况。原因不是模型变乖了，而是数据流设计让幻觉无处发生——你不给它猜测的机会，它就不会猜。155KB的engine_log.txt记录了每一次API调用的入参出参，任何数据点都可以追溯到原始请求，这是传统人工尽调无法实现的审计粒度。

落地启示：从"辅助工具"到"基础设施"

企业尽调引擎跑通60+家企业的实践，验证了一个判断：AI在金融场景的价值不在于"替代人做分析"，而在于把数据采集和标准化做成基础设施。

传统尽调的5天里，4.5天花在数据采集和格式整理上，真正的分析判断只有半天。AI把这4.5天压缩到10分钟，分析师可以把精力释放到真正需要人类判断的环节——行业趋势解读、关联方利益分析、谈判策略制定。

更深远的影响是标准化。60+企业、统一的JSON+MD双格式输出，意味着尽调结果首次具备了可比性。不同分析师做出的报告不再各写各的，而是同一口径下的差异分析——这个价值远超"省几天时间"本身。

从工具到基础设施的跨越，标志不是技术有多复杂，而是能不能被其他系统依赖。当评分引擎可以稳定消费尽调引擎的JSON输出，当风控系统可以把风险扫描结果作为规则触发条件，尽调才真正从"写报告"变成了"造数据"——而这个数据，是整个金融风控链路的起点。

Agent Skills 开源生态

本文涉及的技能和框架已开源，欢迎 Star / Fork / PR：

仓库	内容	协议	链接
financial-ai-skills	104个金融AI技能，零API费	MIT	https://github.com/yuzhaopeng-up/financial-ai-skills
teleagent-skills	5个通用Agent技能(评分引擎/证据链/数据聚合/可视化/NL2Query)	Apache 2.0	https://github.com/yuzhaopeng-up/teleagent-skills
agent-cluster-comm	5层集群通信技能(L1-L5)	Apache 2.0	https://github.com/yuzhaopeng-up/agent-cluster-comm
skill-framework	208技能分类体系+L0-L4框架+YAML模板	MIT	https://github.com/yuzhaopeng-up/skill-framework
fintech-h5-demos	12个零依赖金融H5演示	MIT	https://github.com/yuzhaopeng-up/fintech-h5-demos

AI生成

4A Enterprise Architecture + TOGAF: How to Guide Agent Skill Design

兆鹏于 — Fri, 03 Jul 2026 12:43:07 +0000

4A企业架构+TOGAF如何指导Agent Skill设计

引言：AI Skill设计的"巴别塔"困局

当下的AI Agent生态，正陷入一种似曾相识的混乱。

去年帮一家保险公司梳理Agent技能库，发现100多个Skill横七竖八地堆在一起——有的直接调API，有的内嵌业务逻辑，有的把数据获取和分析揉成一团。问架构师这些Skill怎么分类，回答是"按安装顺序排的"。再问两个Skill之间数据怎么流转，回答是"各写各的"。一个股票监控Skill自己爬数据、自己做分析、自己发消息，三件事耦合在同一个脚本里。换一个场景想复用其中的分析逻辑？做不到，只能重写。

这不是个例。几乎所有率先部署AI Agent的企业都面临同样的困境：Skill越堆越多，越堆越乱。缺乏统一的能力域划分，缺乏标准化的数据接口，缺乏清晰的组合规则，缺乏可复用的构建块沉淀。

听起来很熟悉？没错——这正是企业架构在20年前要解决的问题。当年企业信息化的混乱，和今天AI Skill的混乱，本质上是一回事：没有架构约束的开发，必然走向无序。

4A企业架构（业务架构BA、数据架构DA、应用架构AA、技术架构TA）加上TOGAF的构建块思想，为Agent Skill设计提供了一套经过验证的方法论。本文试图建立这二者之间的映射框架，并用实际案例说明其可行性。

4A映射框架：四个问题驱动Skill设计

企业架构的核心是四个问题：做什么（BA）、数据怎么流（DA）、用什么组合（AA）、底层怎么支撑（TA）。这四个问题同样适用于Skill设计。

┌───────────────────────────────────────────────────────────────┐
│                  4A → Skill 映射框架                           │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  BA 业务架构                                                  │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │ Skill的业务能力域划分、价值链映射                          │  │
│  │ → 回答"这套Skill体系解决什么业务问题"                      │  │
│  └────────────────────────────┬────────────────────────────┘  │
│                               │                               │
│  DA 数据架构                  │                               │
│  ┌────────────────────────────▼────────────────────────────┐  │
│  │ Skill的数据流、信息交换标准                               │  │
│  │ → 回答"Skill之间数据怎么流转、用什么格式"                  │  │
│  └────────────────────────────┬────────────────────────────┘  │
│                               │                               │
│  AA 应用架构                  │                               │
│  ┌────────────────────────────▼────────────────────────────┐  │
│  │ Skill的组合关系、依赖图谱                                 │  │
│  │ → 回答"哪些Skill可以组装、依赖关系是什么"                   │  │
│  └────────────────────────────┬────────────────────────────┘  │
│                               │                               │
│  TA 技术架构                  │                               │
│  ┌────────────────────────────▼────────────────────────────┐  │
│  │ Skill的运行时、工具链、基础设施                            │  │
│  │ → 回答"Skill跑在什么环境上、需要哪些依赖"                   │  │
│  └─────────────────────────────────────────────────────────┘  │
│                                                               │
└───────────────────────────────────────────────────────────────┘

用表格更清晰地展示每一层的核心映射关系：

架构层	核心问题	Skill映射	示例
BA 业务架构	Skill解决什么业务问题？	按能力域分组：协同办公、内容生产、数据智能、系统运维等	"数据智能"域包含客户画像、股票监控、财务智能等Skill
DA 数据架构	Skill间数据如何流转？	统一数据格式(Markdown/JSON)，标准化数据源→转换→消费管道	搜索Skill输出Markdown，报告Skill消费Markdown生成PDF
AA 应用架构	Skill如何组合复用？	构建依赖图谱，上层组合下层，同类可替换	日报Skill = 搜索 + 摘要 + PDF转换 + 消息推送
TA 技术架构	运行环境与基础设施？	运行时(Node.js/Python/Bash)、外部API、认证体系	飞书Skill依赖OAuth 2.0，股票Skill依赖行情API

BA：从业务能力域出发

BA层的核心工作是"能力域划分"。不要按技术实现分类Skill，要按业务价值分类。一个"协同办公"能力域下面，飞书日程、企业微信待办、钉钉审批虽然对接不同平台，但解决的是同一个业务问题。反过来，"飞书套件"把日历、文档、消息、任务捆在一起，看似整齐，却掩盖了能力域的本质。

正确的做法是先画出业务能力地图，再把Skill填进去。缺失的能力域意味着需要新建Skill，同一域内的重叠Skill意味着需要合并或分层。实践中我们发现，7个能力域（协同办公、内容生产、数据智能、系统运维、开发工具、消息通道、娱乐生活）足以覆盖70+个Skill的分类。

DA：让数据在Skill间自由流动

数据架构解决的是Skill之间的"对话协议"。两个Skill要协作，必须约定数据格式。在实践中我们确立了几条规则：文档类输出统一用Markdown，API数据交换统一用JSON，时间字段统一用ISO 8601，用户标识统一用开放平台ID。同时识别出三类数据角色：数据源Skill（搜索、行情接口）、数据转换Skill（摘要、画像生成）、数据消费Skill（报告推送、告警通知）。数据流向必须是"源→转换→消费"，不允许消费Skill直接做原始数据采集——这跟数据仓库的ODS→DWD→ADS分层是一个逻辑。

AA：构建可组装的Skill图谱

应用架构关注的是Skill之间的组合与依赖。核心原则是：上层Skill必须组合下层Skill，绝不重复实现下层逻辑。一个股票监控Skill自己写爬虫——违反了这条原则。正确做法是依赖"搜索Skill"获取数据，自己只负责分析逻辑。这样当搜索能力升级（比如换了更快的API），所有上游Skill自动获益。

依赖分为强依赖和弱依赖。强依赖意味着Skill无法独立运行（如报告Skill依赖PDF转换），弱依赖意味着功能降级但可用（如故障排查Skill对平台套件的依赖是可选的）。在实际部署中，强依赖必须标注在Skill元数据中，弱依赖可以按环境条件加载。

TA：基础设施决定Skill的上限

技术架构层容易被忽视，却决定了整个Skill体系的天花板。需要明确四个子层：运行时层（Node.js网关、Python脚本引擎、Bash命令行、浏览器自动化）、工具链层（Git、tmux、curl、Chromium）、基础设施层（Skill注册中心、记忆系统、消息路由器）和外部集成层（飞书/企微/钉钉/GitHub等开放平台）。每一层的选型都影响Skill的可移植性和可扩展性。比如，如果一个Skill强依赖Chromium的特定版本，它在无头服务器上的部署就会受限。

TOGAF构建块思想：从原子能力到多智能体联动

TOGAF将架构元素分为架构构建块（ABB）和解决方案构建块（SBB）。映射到Skill体系：ABB是原子级Skill，不可再分；SBB是组合Skill，由多个ABB组装而成。更进一步，多个SBB可以再次组合，形成解决特定行业场景的"组合的组合"；而跨Agent之间的协同编排，则构成了多智能体联动场景。

┌─────────────────────────────────────────────────────────┐
│  ABB → SBB → 组合的组合 → 多智能体联动                    │
│                                                         │
│  ABB(基础Skill): 消息发送、搜索、文件读写...              │
│       │                                                 │
│       ▼ 组合                                            │
│  SBB(组合Skill): 任务流编排、文档管道、故障诊断...         │
│       │                                                 │
│       ▼ 再组合                                          │
│  业务域Skill: 股票监控、日报生成、客户画像...              │
│       │                                                 │
│       ▼ 跨Agent编排                                     │
│  多智能体场景: 投研生产线、编码军团、故障响应中心...        │
└─────────────────────────────────────────────────────────┘

以"投研生产线"为例：L1层的搜索Skill和行情Skill是ABB；L2层的报告模板引擎是SBB；L3层的投研报告生成是业务域Skill；L4层面，投研报告+股票监控+客户画像+日报推送组成完整的生产线，多个Agent各司其职。

构建块思想的核心价值是复用。统计表明，一个核心L1 Skill（如消息推送）被5个以上的L3 Skill依赖。如果没有构建块分层，每个L3 Skill都自建消息推送逻辑，改一个通知格式就要改5个地方。有了构建块，改一次，全部受益。

L0-L4五层分类体系

结合4A思想和构建块复用原则，我们设计了L0到L4的五层分类模型：

┌───────────────────────────────────────────────────────────┐
│  L4 多智能体联动  │ 编码军团、投研生产线、故障响应中心      │
│                   │ 跨Agent协作、分布式执行                   │
├───────────────────┼───────────────────────────────────────┤
│  L3 业务域编排    │ 股票监控、日报生成、客户画像、投研报告   │
│                   │ 完整业务流程、有状态管理                  │
├───────────────────┼───────────────────────────────────────┤
│  L2 组合Skill     │ TaskFlow引擎、文档管道、渠道配置         │
│                   │ 跨域集成、可复用模式                     │
├───────────────────┼───────────────────────────────────────┤
│  L1 基础Skill     │ 飞书套件、企微套件、搜索、GitHub...      │
│                   │ 单域封装、可直接调用外部API              │
├───────────────────┼───────────────────────────────────────┤
│  L0 原子能力      │ 浏览器、Web搜索、文件读写、Shell执行...  │
│                   │ 工具原语、不嵌入任何业务逻辑             │
└───────────────────┴───────────────────────────────────────┘

层级	数量	复用度	开发成本	行业属性
L0	~15	极高	低	全行业通用
L1	~50	高	低-中	全行业通用
L2	~8	中	中	全行业通用
L3	~6	低	高	行业专用
L4	~2	低	极高	视场景而定

一个关键规则：L3及以上Skill必须依赖L2和L1，绝不重复实现下层逻辑。L0层绝不嵌入业务逻辑。这保证了85%的Skill是跨行业通用的，行业特殊性只体现在L3和L4层。

无边界信息流：Skill间的信息破壁

TOGAF的III-RM参考模型有一个核心愿景：无边界信息流——信息应该在正确的时间、以正确的格式、交付给需要的人或系统，不受组织边界和技术边界的限制。

映射到Skill体系，这意味着：一个分析结果不应该被锁在生成它的Skill内部，而应该能被任何需要它的Skill消费。具体落地需要三件事：

第一，统一数据格式。 Markdown作为文档标准，JSON作为API标准，ISO 8601作为时间标准。任何Skill的输出都应符合这三项标准之一。

第二，解耦数据源与数据消费。 搜索Skill只负责"获取"，分析Skill只负责"加工"，推送Skill只负责"送达"。一个典型的信息流是：外部数据源→搜索层→分析层→分发层。搜索Skill获取新闻，摘要Skill提炼要点，报告Skill生成简报，消息Skill推送到飞书或企微——四个Skill各司其职，数据沿管道流动。

第三，记忆层打通跨会话壁垒。 很多Skill的输出需要持久化。加入记忆层（长期记忆文件、每日日志）后，今天的分析结果可以成为明天的上下文。一次投研报告中的行业判断，下次生成同类报告时自动引用。这是更高层次的无边界——时间维度的信息流。

实战案例：四个垂直行业的Skill蓝图

4A架构不是纸上谈兵。以下展示教育、医疗、法律、地产四个行业的Skill设计思路，每一条都遵循BA→DA→AA→TA的自上而下推导。

教育行业：课程编排与学情追踪

BA层识别出三个核心能力域：课程管理、学情追踪、教学资源。DA层的数据标准是：课程用JSON Schema描述，学习进度用时间序列数据，作业用Markdown。AA层组合方案：学情追踪Skill = 飞书多维表格（数据源）+ 摘要Skill（分析）+ 消息推送（通知家长）。TA层需要日历API集成和分数采集接口。

医疗行业：健康监测与就诊编排

BA层能力域：健康监测、预约管理、病历解读。DA层：健康指标用FHIR标准格式，预约数据走医院HIS接口，病历文本用Markdown存储。AA层：健康监测Skill = 可穿戴设备数据源 + 阈值告警引擎(L2可复用) + 企微卡片推送。TA层需要医疗设备对接和隐私合规（数据脱敏是前提）。

法律行业：合同审查与案例检索

BA层能力域：合同审查、法规检索、案例分析。DA层：合同用结构化JSON描述风险条款，法规数据用法条编号索引，案例用Markdown摘要。AA层：合同审查Skill = 文档解析(L1) + 风险识别引擎(L2) + 修订建议生成(L3)。这个场景的价值在于，L2层的风险识别引擎可以同时服务合同审查和合规检查两个L3 Skill，体现构建块复用。

地产行业：房源匹配与市场分析

BA层能力域：房源搜索、市场分析、贷款计算。DA层：房源用GeoJSON描述位置属性，行情数据用时间序列，贷款参数用结构化JSON。AA层：市场分析Skill = 地图POI搜索(L1) + 趋势分析(L2) + 区域报告生成(L3)。其中L2趋势分析引擎与金融行业的股票趋势分析同构，可跨行业复用。

四个行业的共性在于：L0和L1层完全通用（搜索、消息、文档、地图），L2层的模式引擎（告警、分析、模板）高度可复用，行业差异集中在L3层的业务规则和L4层的协作模式上。这正是4A架构的价值所在——用分层隔离变化，用构建块沉淀复用。

回过头看，一个常见的误区是"先写Skill再想架构"。这种做法在Skill数量很少时看不出问题，一旦超过20个，就会遇到开头描述的困境：重复建设、数据孤岛、组合困难、难以演化。正确做法是"架构先行"：先做完BA能力域梳理，再开发Skill；先定DA数据格式标准，再做跨Skill集成；先画AA依赖图，再决定组合方式；先确认TA运行环境，再选技术路线。先慢后快，整体效率反而更高。

还有一个实际建议：建立Skill注册中心。每个Skill在注册时必须声明四项元数据——所属能力域(BA)、数据输入输出格式(DA)、依赖的其他Skill(AA)、需要的运行时环境和API密钥(TA)。这四项声明就是Skill的"架构身份证"，有了它，Skill的发现、复用、替换才有据可依。没有注册中心的Skill体系，就像没有资产目录的仓库——东西都在，但谁也找不到。

总结：从IT治理到AI治理

企业架构思想从IT时代传承到AI时代，其核心没有变：用结构化思维对抗复杂性。4A告诉我们要分层思考（从业务到技术），TOGAF告诉我们要用构建块组合（而非从零开始），无边界信息流告诉我们要让数据自由流动（而非锁在孤岛里）。

变化的是治理对象：从数据库和中间件，变成了Agent Skill和提示词；从SOA服务契约，变成了Skill间的JSON数据格式；从ESB企业服务总线，变成了消息路由器和记忆系统。不变的是方法论：先做业务能力域划分（BA），再定义数据流转标准（DA），然后设计组合依赖图谱（AA），最后选型基础设施（TA）。

AI Agent生态正处在从"能用"到"好用"的拐点。能写一个Skill不难，难的是让100个Skill有序共存、可组合、可演化。企业架构不是AI开发者的负担，而是避免"巴别塔"重演的脚手架。当Skill数量突破临界点时，有没有架构，决定了生态是繁荣还是坍塌。

Agent Skills 开源生态

本文涉及的技能和框架已开源，欢迎 Star / Fork / PR：

仓库	内容	协议	链接
financial-ai-skills	104个金融AI技能，零API费	MIT	https://github.com/yuzhaopeng-up/financial-ai-skills
teleagent-skills	5个通用Agent技能(评分引擎/证据链/数据聚合/可视化/NL2Query)	Apache 2.0	https://github.com/yuzhaopeng-up/teleagent-skills
agent-cluster-comm	5层集群通信技能(L1-L5)	Apache 2.0	https://github.com/yuzhaopeng-up/agent-cluster-comm
skill-framework	208技能分类体系+L0-L4框架+YAML模板	MIT	https://github.com/yuzhaopeng-up/skill-framework
fintech-h5-demos	12个零依赖金融H5演示	MIT	https://github.com/yuzhaopeng-up/fintech-h5-demos

AI生成

4 Industries 36 Roles 184 Scenarios: A Complete Financial AI Map

兆鹏于 — Fri, 03 Jul 2026 12:37:22 +0000

4大行业36岗位184场景：金融AI全覆盖实战地图

引言：场景够不够，从来不是问题

金融行业谈AI落地，最常听到的问题是"我们有没有场景"。这个问题本身就问错了。

事实是：一家中型银行从总行到支行，从前台营销到后台审计，天然就存在上百个AI可介入的业务节点。真正的挑战不是"有没有场景"，而是"场景怎么系统化覆盖"——零散试点容易，全行铺开难；单点出彩容易，规模化复制难。

我们花了18个月，跑遍了银行、证券、保险、基金四大行业，逐一拆解每个岗位的日常工作流，最终梳理出一组数据：4大行业、36个岗位、184个智能场景——并且全部已上线运行。

这不是PPT上的规划，是实打实的生产数据。本文将完整呈现这组数据的结构、背后的技术架构，以及从184个场景落地中提炼出的方法论。

全景数据：4行业 x 36岗位 x 184场景

先看全貌：

行业	岗位数	场景数	占比
银行业	24	119	64.7%
证券业	5	30	16.3%
保险业	4	18	9.8%
基金业	3	17	9.2%
合计	36	184	100%

银行以119个场景占据近三分之二，这与银行业组织架构最复杂、岗位分工最细直接相关。证券、保险、基金相对聚焦，但每个行业都有其不可替代的专属场景——比如保险的智能核保、基金的估值核算，在其他行业没有直接对标。

再看每个行业的代表性岗位与场景：

银行业——选取6个核心岗位展示：

岗位	代表场景	功能说明	状态
零售客户经理	360度客户画像	基于RFM模型生成客户画像	已上线
零售客户经理	AI营销话术	输入客户信息，秒级生成专业话术	已上线
对公客户经理	企业尽调报告	多节点协同企业风险扫描	已上线
理财经理	资产配置方案	基于风险偏好的配置建议	已上线
风控经理	反欺诈预警	AI识别欺诈交易模式	已上线
财务会计	智能审计抽样	风险导向审计抽样	已上线

证券业——选取5个岗位展示：

岗位	代表场景	功能说明	状态
投资银行	IPO尽调辅助	招股书材料整理	已上线
研究所	研报智能生成	智能研报效率提升50%	已上线
资产管理	业绩归因	投资收益归因分析	已上线
经纪业务	投资顾问话术	适当性合规话术	已上线
自营交易	交易指令优化	算法交易辅助	已上线

保险业——选取4个岗位展示：

岗位	代表场景	功能说明	状态
销售渠道	客户保障需求分析	缺口分析加方案推荐	已上线
核保理赔	智能核保	风险评估加自动决策	已上线
精算产品	产品定价辅助	费率厘定模型	已上线
客户服务	客户回访外呼	AI外呼替代人工	已上线

基金业——选取3个岗位展示：

岗位	代表场景	功能说明	状态
投研	量化策略辅助	因子分析加回测	已上线
销售	机构销售辅助	定制化推介材料	已上线
运营	估值核算辅助	基金会计日报	已上线

以上仅展示每个行业的代表性切片。银行另有17个岗位的百余场景未逐一展开，证券另有25个场景、保险另有14个场景、基金另有14个场景同样已上线运行。

银行业深度：24岗位119场景的拆解

银行业场景数量是其他三个行业之和的近两倍，值得单独拆开来看。按前中后台分层：

层级	岗位数	场景数	代表岗位
前台-营销服务	6	31	零售客户经理、对公客户经理、理财经理、大堂经理、私人银行客户经理、信用卡专员
前台-金融市场	4	21	交易员、投资经理、研究员、资金运营
中台-风险管理	5	21	信用审批、风控经理、合规专员、反洗钱专员、法务
中台-产品运营	4	18	产品经理、运营管理、渠道管理、数字化运营
后台-支持保障	5	28	财务会计、人力资源、信息技术、审计、行政

三个值得注意的结构特征：

第一，前台场景占比最高（52个，占43.7%）。 这是AI创造直接收入的主战场。以零售客户经理为例，6个场景形成完整闭环：画像→分层→推荐→话术→召回→训练。单个场景独立可用，串联起来就是一套AI驱动的客户经营体系。信用卡专员的6个场景同样形成闭环：开卡话术→权益激活→分期推荐→进件核对→审批辅助→逾期催收，覆盖了信用卡从获客到催收的全生命周期。

第二，后台场景密度最高（5个岗位28个场景，均值5.6）。 特别是财务会计一个岗位就有9个场景，覆盖从发票查验、预算管控到智能审计抽样的全流程。这些场景单体价值不高，但胜在刚性需求强、替代人工效果显著，是银行AI落地的"基本盘"。

第三，中台风险管理的场景刚性最强。 风控经理、合规专员、反洗钱专员共13个场景，每一个都直接关联监管要求，不来虚的。反欺诈预警、可疑交易识别、合规话术红绿灯——这些场景的需求方不是业务线而是合规部，需求稳定性远高于营销类场景。

再选取银行3个高密度岗位展开细节：

研究员（7个场景）——银行前台场景最多的岗位：

序号	场景名称	功能说明	状态
1	研报智能生成	智能研报效率提升50%	已上线
2	调研纪要生成	录音转文字加要点提炼	已上线
3	市场观点输出	日报/周报自动生成	已上线
4	财报数据提取	自动提取关键财务指标	已上线
5	研报知识库检索	智能研报检索与问答	已上线
6	ESG研究	ESG评级与投资建议	已上线
7	宏观研究分析	宏观经济与政策分析	已上线

研究员的7个场景分布在三种工作模式上：信息获取（财报提取、知识库检索）占2个，内容生成（研报、纪要、观点）占3个，深度分析（ESG、宏观）占2个。这正是AI介入研究工作的最佳切入点：先替代机械性的信息整理，再辅助半结构化的内容生成，最后增强非结构化的深度分析。

财务会计（9个场景）——银行业场景最多的单体岗位：

序号	场景名称	功能说明	状态
1	发票查验	发票真伪加信息提取	已上线
2	预算管控	部门预算执行分析	已上线
3	财报速读	AI提取关键财务数据	已上线
4	税务筹划	税负分析加节税建议	已上线
5	费用报销	智能审核加合规检查	已上线
6	资金预测	现金流缺口预警	已上线
7	财务规划	全面财务规划与分析	已上线
8	资产负债管理	ALM资产负债优化	已上线
9	智能审计抽样	风险导向审计抽样	已上线

9个场景可归为三类：事务处理类（发票查验、费用报销）直接替代重复劳动，决策辅助类（预算管控、税务筹划、资金预测、资产负债管理）提供分析建议，报告生成类（财报速读、财务规划、审计抽样）自动化文档工作。三类场景对AI的能力要求逐级递进，落地难度也逐级递增。

风控经理（6个场景）——风险管理的AI中枢：

序号	场景名称	功能说明	状态
1	风险扫描	多维风险识别	已上线
2	风险预警	实时风险监测	已上线
3	压力测试	情景分析加压力测试	已上线
4	反欺诈预警	AI识别欺诈交易模式	已上线
5	操作风险识别	业务流程风险监测	已上线
6	客户风险评级	AML/KYC风险评级	已上线

风控场景有一个共同特征：输入高度结构化（交易数据、客户画像、行为日志），输出需要可解释（监管要求）。这使得风控场景成为评分引擎类Handler的最佳试验田——ScoringEngineHandler的权重透明、规则可审计，恰好满足合规需求。

技术架构：7个Handler + 场景注册中心

184个场景不是184个独立开发的项目。背后是一套"场景注册中心 + 7个Handler"的统一架构：

组件	职责	典型场景举例
KnowledgeRAGHandler	知识库检索与RAG问答	合规自查、研报知识库检索、产品手册问答
DataAnalysisHandler	结构化数据分析	渠道效能分析、业绩归因、用户增长方案
ReportGenHandler	报告自动生成	研报生成、尽调报告、审计底稿、投研报告
TextGenHandler	文本生成（话术/文案/纪要）	AI营销话术、朋友圈文案、会议纪要、公文写作
ScoringEngineHandler	多维评分引擎	信用评估、客户风险评级、智能审批辅助
OCRParserHandler	文档解析与信息提取	发票查验、进件材料核对、财报数据提取
VisualizationHandler	数据可视化与看板	数据看板生成、客户热力图、风险仪表盘

场景注册中心维护一张完整的场景路由表：每个场景对应哪个Handler、输入什么数据、输出什么格式、调用哪个Skill，全部声明式定义。新增场景不需要从零开发，只需在注册中心登记一条配置，指明Handler类型和提示词模板即可。

这就是为什么184听起来很多，但开发工作量并非184倍——大量场景共享同一套Handler逻辑，区别仅在于提示词模板和数据源。以TextGenHandler为例，它同时支撑了零售话术、对公话术、理财话术、信用卡话术、保险条款解读、朋友圈文案等20余个场景，核心逻辑完全一致。

开发效率：7天完成27个新场景

这套架构带来的直接收益是开发效率的质变。一个典型的冲刺周期：

指标	数据
冲刺周期	7天
新增场景数	27个
平均单场景耗时	约2.5小时
复用Handler占比	82%（22/27）
纯新Handler开发	5个场景需新建

27个场景中，22个直接复用现有Handler——TextGenHandler一个组件就吃掉了其中8个话术类场景，KnowledgeRAGHandler覆盖了5个知识检索类场景。真正需要新建逻辑的场景集中在保险精算和基金估值等垂直领域，这些是行业壁垒所在，投入产出合理。

对比传统开发模式：如果每个场景独立开发，27个场景平均需要2-3人天/个，总工期约54-81人天。Handler复用把这个数字压缩到原来的十分之一。更重要的是，复用模式确保了场景间的体验一致性——用户在不同岗位上使用不同场景，交互范式相同，学习成本趋近于零。

落地启示：从"点状试点"到"面状覆盖"

回顾184个场景的成功落地，有三条经验值得提炼：

第一，先铺广度再挖深度。 不要在一个岗位上把场景做到极致再拓展下一个岗位。先让每个岗位有1-2个核心场景跑起来，建立信心和使用习惯，再逐步补齐。银行业从6个前台岗位的18个场景起步，6个月内扩展到24岗位119场景。先有覆盖面，才有深度的意义。

第二，用Handler复用率控制开发成本。 场景数量是面子，Handler复用率是里子。如果新增10个场景需要开发10个新组件，架构就有问题。健康的比例是70%以上场景复用现有Handler。经验阈值：当复用率低于60%，说明行业拆分还不够细，需要重新提取共性。我们的实测数据是82%，说明7个Handler的抽象粒度是合理的。

第三，场景的上线顺序有讲究。 推荐路径：先上高频低风险场景（话术生成、文档提取、信息检索），再上高频高风险场景（信用评估、反欺诈），最后上低频高价值场景（家族信托方案、ESG研究）。让业务一线先用起来，让合规团队有时间建立信任。在我们落地过程中，第一批上线的30个场景全部是高频低风险类，上线后零合规事故，为后续攻坚奠定了信任基础。

184个场景不是终点。每新增一个行业客户，都会发现新的岗位和场景需求。但有了Handler加注册中心的架构基座，从184到300，增量成本远低于从0到184。金融AI的最后一公里，拼的不是场景数量，而是场景的系统化覆盖能力。

Agent Skills 开源生态

本文涉及的技能和框架已开源，欢迎 Star / Fork / PR：

仓库	内容	协议	链接
financial-ai-skills	104个金融AI技能，零API费	MIT	https://github.com/yuzhaopeng-up/financial-ai-skills
teleagent-skills	5个通用Agent技能(评分引擎/证据链/数据聚合/可视化/NL2Query)	Apache 2.0	https://github.com/yuzhaopeng-up/teleagent-skills
agent-cluster-comm	5层集群通信技能(L1-L5)	Apache 2.0	https://github.com/yuzhaopeng-up/agent-cluster-comm
skill-framework	208技能分类体系+L0-L4框架+YAML模板	MIT	https://github.com/yuzhaopeng-up/skill-framework
fintech-h5-demos	12个零依赖金融H5演示	MIT	https://github.com/yuzhaopeng-up/fintech-h5-demos

AI生成

The Last Mile of Financial AI: From Usable to Trustworthy

兆鹏于 — Fri, 03 Jul 2026 12:37:11 +0000

金融AI的最后一公里：从"能用"到"可信"的跃迁

大模型参数突破万亿，金融机构AI项目仍困在PPT阶段。本文直指落地结构性困境——"能说不会做"，提出从工具到集群的认知跃迁路径，并以龙马金融智能体集群两年实战为基础，展示四大行业36岗位184场景的落地全貌。

引言：AI的"最后一公里"

2026年，全球大模型参数规模突破万亿，"最强模型"每季度易主。然而麦肯锡2025年调研显示：全球78%的金融机构已启动AI项目，只有12%进入生产环境。中国的情况更典型——某头部银行2024年上线37个AI应用，6个月后仍在使用的仅9个，存活率不到四分之一。

这不是技术问题。问题出在"最后一公里"——从"AI能做什么"到"AI在业务场景中能可靠地做什么"，中间横亘着一条深不见底的鸿沟。金融场景对AI的容错空间几近于零：一个误写的数字可能导致数百万风险敞口，一次概念混淆可能引发合规处罚。

过去两年多，龙马金融智能体集群从25个场景起步，将覆盖面扩展到银行、证券、保险、基金四大行业：

行业	岗位数	场景数	代表功能
银行	19	118	智能分流、尽调报告、反欺诈预警、发票查验
证券	6	25	投研报告、调研纪要、路演材料
保险	5	22	智能核保、理赔审核、续保预警
基金	6	19	基金研报、组合优化、业绩归因

三重困境：AI为什么"能说不会做"

单点工具的割裂

金融机构不缺AI工具，问题在于每个工具只做了流程的一个片段，片段之间断裂。发票OCR做完，查验真伪要登录国税网站，核对合规要翻规则手册，判断入账要对接财务系统——AI只参与了第一环，后面全靠人工。

某城商行采购"AI财务套件"，包含发票识别、预算管理、报销审核三个模块。上线三个月后发现：发票识别输出Excel，预算管理要求JSON，每周IT手动转换；报销审核通过的发票无法同步扣减预算，超支预警形同虚设；三个模块三套权限，主管维护三遍。最终回到Excel手工台账。AI省下的时间，被系统切换消耗殆尽。

AI幻觉的高风险

某券商"AI投研助手"上线首月：将"归母净利润"误写为"扣非净利润"，估值模型完全偏离；计算市盈率使用过期股价，"低估"结论与现实相反。

大模型不是在"理解"金融数据，而是在"预测下一个最可能出现的词"。金融对精确性的要求，把幻觉从"小毛病"放大为"致命伤"——营收增长15.3%和15.8%差0.5个百分点，在信贷审批中可能意味着风险等级跨档。

POC到生产的断崖

某银行"智能报销系统"，POC测试500张标准发票识别率99.2%，上线首月处理12000张真实发票，准确率骤降至47.3%。那500张全是扫描仪高清增值税专用发票——从99%到47%的断崖，是理想数据与真实世界的鸿沟。更隐蔽的是信任危机：AI给出结论却无法解释"为什么"，黑盒天然引发怀疑。

三重困境叠加，形成金融AI的"不可能三角"：不可能同时拥有单点工具的灵活、通用模型的能力和金融场景的可靠性。突破不可能三角，需要全新的认知框架。

认知跃迁：工具→智能体→集群

工具思维："我有一款AI，能帮我做某件事"

工具思维下AI是被动的——用户发指令，AI返回结果，任务结束。局限在于：它假设用户知道该做什么，但金融场景复杂度常超出个体认知边界。早期做企微推送时，Markdown格式报告在手机端表格挤压、核心指标淹没、无行动引导——典型的工具思维，"我有个推送功能，能把报告发出去"，发是发了，体验一塌糊涂。

智能体思维："我有一个AI，能理解目标，自主完成任务"

智能体思维下AI是主动的。用户描述目标——"帮我完成这家企业的尽调"——智能体自动拆解任务、调用能力、逐步执行。从Markdown升级为企微Template Card+H5详情页：

卡片只做摘要：风险等级大字号高亮，3秒抓核心
底部按钮引导：点击查看完整H5分析报告
分层设计：卡片是封面，H5是正文

首屏从25行压缩到8行，风险等级一眼可见，行动按钮引导深度阅读，卡片发送失败自动降级为Markdown。这本质上是从"工具思维"到"智能体思维"的微观跃迁——不是"发送信息"，而是"以用户目标为中心组织信息呈现"。

关键是：智能体不要求AI"更聪明"，而要求"更有纪律"。聪明由底层模型决定，纪律由架构设计保证。纪律比聪明更可靠。

集群思维："多个专业智能体各司其职，协同完成复杂任务"

一个智能体再强也有边界。企业尽调需同时处理财务分析、行业研究、风险扫描、合规审查——每个维度需不同专业能力。靠一个智能体"包打天下"，疑难杂症看不了。

集群思维的要义：不是让一个智能体什么都做，而是让多个专业化智能体协同完成单个智能体无法完成的复杂任务。集群的价值不是"拼工具"，而是通过协同产生质变——交叉验证发现单一视角看不到的风险，能力互补弥补短板。不是1+1=2，而是1+1>4。

为什么大于4？两个节点交叉验证，不是多了"一个视角"，而是多了一个"交叉点"——财务分析发现"ROE异常偏高"，风险扫描发现"存在大额关联交易"，交叉产生新洞察：高ROE可能来自关联交易虚增收入。两节点单独都看不到。

三次跃迁的逻辑：工具思维解决"能不能做"，智能体思维解决"会不会做"，集群思维解决"做得好不好"。金融AI的核心挑战，恰恰在最难的第三层。

方法论：专业分工、强制编排、多源验证

专业分工：让每个节点只做一件事，做到极致

集群设计的首要原则是单一职责。龙马集群六节点分工：有的擅长长文本分析和报告生成，有的擅长代码执行和数据采集，有的专门守护隐私数据本地处理，有的负责定时运维和文件操作——每个节点有明确的能力边界，边界内追求极致，边界外绝不越界。

分工的核心不是"能不能"，而是"该不该"。金融跨域操作的结果往往是"及格但不优秀"，在金融场景中"及格"等于不及格。副产品是容错：一个节点离线，其他照常工作。

强制编排：流程不是建议，是纪律

金融业务流程有严格顺序：贷前尽调必须在审批之前，风险扫描必须在授信之前。四阶段强制编排：任务解析与输入校验→核心执行→结果校验与质量门禁→输出交付与归档沉淀，每步有前置条件和退出标准。

某企业工商注册信息与财报公司全称差两个字——"集团"vs"股份"——完整性校验拦截了这一步，避免将两家不同企业的数据混淆。强制编排还解决了进度透明问题，用户能看到每步状态，消除黑盒焦虑。

多源验证：不依赖单一模型输出，交叉校验抵抗幻觉

分析某上市公司时，一个节点从新闻舆情提取到"获得政府补贴5亿元"，另一个节点从财报附注发现"政府补助实际到账1.2亿元"。系统标记人工复核。核实后，"5亿元"是"拟申请金额"而非实际到账——AI误将"计划"当"事实"。没有交叉验证，此错误可能进入风险评估模型，导致偿债能力过度乐观。多源验证的价值不在于消除错误，而在于让错误在输出前被捕获。

三位一体：分工决定"能做什么"，编排决定"怎么做事"，验证决定"做出来的能不能信"。 金融场景三者缺一不可。

实战实证

对公尽调：5天→10分钟

企业尽调信息分散在天眼查、巨潮资讯、裁判文书网等数十个渠道，传统模式平均耗时3至5个工作日，不同客户经理分析深度参差不齐。龙马集群采集、分析、扫描三路并行启动，各自完成后汇总校验。信息采集从5天缩短到10分钟，检查点从人工50个提升到100个以上。智能体负责"穷尽"，审批人负责"取舍"——穷尽可标准化，取舍需经验。

零售营销：精准滴灌

传统营销标签粗粒度，短信打开率不足2%。集群采用"洞察引擎+推荐引擎+效果追踪"三引擎协同，构建100+标签动态画像，识别需求窗口期。A/B测试：传统短信打开率3.2%→智能体个性化10.1%；转化率0.5%→4.2%，提升超7倍。

风控合规：42%欺诈识别提升

传统规则引擎以静态阈值为核心，单笔转账超50万触发预警。欺诈团伙通过数百个空壳公司分散交易，单笔均在阈值以下，传统规则视而不见。日均一万条预警中九千条是误报，风控人员80%时间花在筛选噪音上。

集群采用三层架构：数据层实时采集交易特征，分析层并行运行规则引擎、异常检测、关联图谱三种分析，决策层综合评估并触发处置。动态规则是关键突破——阈值根据客户画像动态调整，让规则"理解"上下文。效果：欺诈识别率提升42%，误报率降低70%，响应从T+1缩短到秒级。关联图谱将五个看似无关账户因共享手机号、交叉转账、共同地址串联成团伙欺诈链路——这不是规则能查到的，只有"关系视角"才能揭示。多源验证在风控场景不仅是"交叉校对"，更是"维度升维"——从看"点"到看"网"。

财富管理：普惠智慧

传统财富管理悖论：最需要专业配置的客户享受不到专业服务，私行门槛600万+。集群转向"目标驱动"——不问"买什么产品"，问"实现什么人生目标"。每个目标对应不同资产配置方案，方案随市场变化和人生阶段动态调整。边际成本趋近于零，服务门槛可无限降低。这不是慈善，是技术驱动的商业模式重构——让专业财富管理从少数人的奢侈品变为多数人的基础设施。

核心洞察：从"能用"到"可信"的质变逻辑

四大场景指向同一结论：金融AI核心挑战不是"让AI能用"，而是"让AI可信"。三个条件缺一不可——

可解释。 每个结论追溯到推理过程。审批人问"为什么拒绝这笔贷款"，需看到完整链路。链式解释是监管合规硬性要求，更是人机信任的基础设施。

可追溯。 每步操作留痕。某银行审计发现AI审批通过率某月异常上升15%，回溯日志发现某节点升级后遗漏"行政处罚"字段抓取，导致部分企业误判低风险。可追溯性平时看似多余，出事时是救命绳。

可校验。 关键结论不依赖单一来源。两路独立采集、交叉比对，将"信任单一模型"转变为"信任事实交叉"。

可解释、可追溯、可校验——不是技术指标，是金融AI的"信任基座"。 更深层地说，可信AI的核心不在模型多强大，而在错误的"可捕获率"有多高。偶尔犯错但错误总能被发现的系统，比很少犯错但一旦犯错无法察觉的系统更可信。集群不是消灭错误，而是让错误"无处遁形"。

结语

从"能说不会做"的结构性困境，到"工具→智能体→集群"的认知跃迁，再到"专业分工+强制编排+多源验证"的方法论框架——这不是平坦的成功之路，而是充满试错的探索之路。120个Coze智能体的大规模实践教会我们：单体智能体无论数量多少，都无法解决跨场景协同。从量变到质变的催化剂不是数量，而是协作方式。

金融AI不是一个技术命题，而是一个信任命题。技术再强，不能让审批人、风控人员、监管者信任，就只是实验室花火。信任从何而来？靠每个结论有据可查、每个步骤有迹可循、每个风险有法可校。信任靠制度积累，不靠说服建立。

金融AI的"最后一公里"，不是技术问题，是工程问题；不是能力问题，是方法问题；不是"能不能做"的问题，是"怎么做才可信"的问题。

Agent Skills 开源生态

本文涉及的技能和框架已开源，欢迎 Star / Fork / PR：

仓库	内容	协议	链接
financial-ai-skills	104个金融AI技能，零API费	MIT	https://github.com/yuzhaopeng-up/financial-ai-skills
teleagent-skills	5个通用Agent技能(评分引擎/证据链/数据聚合/可视化/NL2Query)	Apache 2.0	https://github.com/yuzhaopeng-up/teleagent-skills
agent-cluster-comm	5层集群通信技能(L1-L5)	Apache 2.0	https://github.com/yuzhaopeng-up/agent-cluster-comm
skill-framework	208技能分类体系+L0-L4框架+YAML模板	MIT	https://github.com/yuzhaopeng-up/skill-framework
fintech-h5-demos	12个零依赖金融H5演示	MIT	https://github.com/yuzhaopeng-up/fintech-h5-demos

How to Govern 200+ Agent Skills: L0-L4 Classification, YAML Templates, and Python Toolchain

兆鹏于 — Wed, 01 Jul 2026 12:47:28 +0000

How I Manage 200+ Agent Skills: L0-L4 Classification + YAML Templates + Python Toolchain

When your Agent project balloons past 200 Skills, "it works" and "it's manageable" are two very different things. In this post, I'll walk you through an open-source governance framework—skill-framework—that uses a five-tier classification model, standardized templates, and an automated toolchain to take Agent skills from wild-west chaos to engineering-grade ops.

1. The Problem: Why Do You Even Need a Skill Governance Framework?

The Agent ecosystem is repeating the same mistake microservices made—grow wild early, lose control later.

Classic symptoms:

Symptom	What It Looks Like	Root Cause
Hard to locate	"Where's that credit-check Skill again?"	No unified classification, skills piled flat
Dependency chaos	Tweak one atomic Skill, 3 scenario Skills break	Dependencies spread by word of mouth, no explicit declarations
Format drift	Same team, different field names and structures	No enforced templates, convention is voluntary
Production incidents	New Skill ships with zero security audit	No quality gate, no checklist
Reuse deadlock	Project A wrote a Skill, Project B has no idea it exists	No industry blueprints, start from scratch every time

At 10 Skills, you can keep it all in your head. At 50, you patch with docs. At 200+—you need an engineering framework.

skill-framework exists for exactly this: https://github.com/yuzhaopeng-up/skill-framework

2. The Core: L0-L4 Five-Tier Classification Model

The foundation of the whole framework is a five-tier model. Each tier has a clear responsibility boundary and dependency direction—higher tiers depend on lower ones, lower tiers never know about higher ones.

┌─────────────────────────────────────┐
│  L4 Multi-Agent   Agent Team Orchestration │  ← Team orchestration
├─────────────────────────────────────┤
│  L3 Scenario       Business Composition    │  ← Business composition
├─────────────────────────────────────┤
│  L2 Gateway/Routing  Intent Routing       │  ← Intent routing
├─────────────────────────────────────┤
│  L1 Base Skills    Atomic Skills         │  ← Atomic skills
├─────────────────────────────────────┤
│  L0 Infrastructure  Infra Connectors      │  ← DB/API connectors
└─────────────────────────────────────┘

Tier Breakdown

Tier	Name	Responsibility	Scope	Typical Examples
L0	Infrastructure	Connect to external systems, wrap data source access	DB connectors, API clients, file I/O	`mysql-connector`, `redis-client`, `oss-file-handler`
L1	Base Skill	Atomic operations, single responsibility, independently executable	Info extraction, data query, report generation, security checks	`info-extractor`, `data-analyst`, `report-generator`, `security-guard`
L2	Gateway/Routing	Accept natural language input, identify intent, route to the right L1 Skill	Intent recognition, permission checks, query dispatch	`nl2-query`, `l3-gw-01` (data query gateway)
L3	Scenario (Ceiling)	Orchestrate multiple L1/L2 Skills into end-to-end business flows	Multi-phase pipelines, business composition	`scoring-engine` (opportunity scoring), `evidence-chain` (evidence chain analysis)
L4	Multi-Agent	Spin up independent sub-Agent teams, role isolation, parallel collaboration	Team orchestration, task scheduling	`agent-teams-orchestrator`, `l7-arkclaw-01` (enterprise ops assistant)

Key constraints:

One-way dependency: L4 → L3 → L2 → L1 → L0, reverse dependencies strictly forbidden
L1 is stateless: Base skills must be pure-function-style, no session state
L2 is stateful: Gateway tier manages session context and routing tables
L3 orchestrates, doesn't execute: Scenario tier only schedules; actual execution sinks to L1
L4 runs in isolation: Each sub-Agent has independent context, data passes via structured JSON

The biggest value of this model isn't theoretical completeness—it's that it assigns each of the 208 Skills to exactly one tier. When you need to find a Skill, first pin down the tier, then narrow by domain, and you're looking at 10–20 candidates max.

3. YAML Templates: 3 Ready-to-Copy Specifications

skill-framework provides three YAML templates covering the three most common Skill shapes:

Template	Target Tier	File	Key Feature
L1 Base Skill	L1 Base Skill tier	`templates/l1-base-skill.yaml`	Single responsibility, declares inputs/outputs and trigger keywords
L2 Gateway Skill	L2 Gateway/Routing tier	`templates/l2-gateway-skill.yaml`	Routing table + permission checks + downstream dependency declarations
L3 Ceiling Skill	L3 Scenario tier	`templates/l3-ceiling-skill.yaml`	Multi-phase orchestration + structured JSON data passing

L1 Base Skill Template Example

# templates/l1-base-skill.yaml
skill_name: ""                    # Required: skill name, kebab-case
skill_level: L1                   # Required: fixed at L1
version: "1.0.0"                  # Required: semantic version

description: ""                   # Required: one-line description
trigger_keywords: []              # Required: trigger keyword list
  # - "keyword1"
  # - "keyword2"

inputs:                           # Required: input parameter definitions
  - name: ""                      # Parameter name
    type: ""                      # Type: string/number/boolean/json/file
    required: true                # Is it required?
    description: ""               # Parameter description

outputs:                          # Required: output definitions
  - name: ""
    type: ""
    description: ""

dependencies:                     # Dependency declarations (L0 only)
  - skill_name: ""
    version: ">=1.0.0"
    usage: ""                     # What this dependency is used for

execution:                        # Execution spec
  type: prompt                    # prompt | python | hybrid
  timeout_seconds: 300            # Timeout
  retry_policy:
    max_retries: 2
    backoff: exponential

security:                         # Security declarations
  data_access_scope: []           # Data access scope
  sensitive_fields: []            # Sensitive field list
  audit_logging: true             # Enable audit logging?

quality:                          # Quality metrics
  min_accuracy: 0.85              # Minimum accuracy
  test_cases: []                  # Test case paths

Field design philosophy:

skill_level is mandatory—the toolchain uses it for dependency legality checks
dependencies is restricted to same-tier or lower; L1 can only depend on L0
security section is required for the quality gate—missing any item gets blocked by skill-lint
quality section is currently declarative; future versions will plug into automated test frameworks

Key Differences in L2 and L3 Templates

L2 Gateway adds:

routing_table:                    # L2-only: routing table
  - intent: ""                    # User intent
    target_skill: ""              # Target Skill to route to
    confidence_threshold: 0.8     # Confidence threshold
permission_check:                 # L2-only: permission checks
  enabled: true
  whitelist: []

L3 Ceiling adds:

phases:                           # L3-only: multi-phase orchestration
  - phase: 1
    name: ""
    skill: ""                     # L1/L2 Skill being called
    input_mapping: {}             # Input mapping
    output_key: ""                # Key to store output
  - phase: 2
    name: ""
    skill: ""
    input_mapping: {}
    output_key: ""
orchestration:                    # L3-only: orchestration strategy
  mode: sequential               # sequential | parallel | conditional
  failure_policy: stop            # stop | skip | retry

4. Python Toolchain: Scan → Lint → Backfill Pipeline

skill-framework bundles 3 Python tools that form an automated pipeline from discovery to compliance:

inventory-scan  ──→  skill-lint  ──→  backfill-frontmatter
   (scan & build inventory)  (compliance check)   (backfill frontmatter)

Tool 1: inventory-scan — Scan & Build Inventory

Scans all Skills under a directory, auto-detects tiers, extracts metadata, and generates a unified inventory.

# Basic: scan all Skills in your project
python tools/inventory-scan.py --root ./skills --output ./output

# With tier validation: auto-detect level tag legality
python tools/inventory-scan.py \
  --root ./skills \
  --output ./output \
  --validate-levels

Outputs:

unified_skill_inventory.json — Complete inventory of 208 Skills (structured JSON)
unified_skill_inventory.csv — Same inventory in tabular form (easy Excel filtering)
skills_dependencies.json — Skill dependency graph

Inventory JSON structure example:

{
  "scan_timestamp": "2026-07-01T10:00:00Z",
  "total_skills": 208,
  "by_level": {
    "L0": 12,
    "L1": 86,
    "L2": 24,
    "L3": 58,
    "L4": 28
  },
  "skills": [
    {
      "name": "info-extractor",
      "level": "L1",
      "version": "2.1.0",
      "description": "从非结构化文本中提取结构化字段",
      "trigger_keywords": ["信息提取", "提取字段", "结构化"],
      "dependencies": ["mysql-connector@L0"],
      "file_path": "skills/info-extractor/SKILL.md"
    }
  ]
}

Tool 2: skill-lint — YAML Compliance Checker

Checks each Skill's YAML declarations against the template spec and outputs a violation report.

# Check a single Skill
python tools/skill-lint.py --target ./skills/info-extractor

# Batch check the whole project
python tools/skill-lint.py --root ./skills --template-dir ./templates

# Strict mode: treat Warnings as Errors
python tools/skill-lint.py --root ./skills --strict

Sample lint rules:

Rule ID	Level	What It Checks
`L001`	Error	skill_level must be one of L0–L4
`L002`	Error	Dependencies must be at a lower tier than the current Skill
`L003`	Error	security section cannot be empty
`L004`	Warning	Recommend adding description for every input
`L005`	Error	L3 Skills must have a phases section
`L006`	Warning	trigger_keywords should have at least 3 entries

Output example:

[ERROR] L002: skills/scoring-engine/skill.yaml
  → dependency "data-analyst" is L1, same level as current skill (L3)
  → Suggestion: L3 skills should depend on L1/L2, not other L3 skills directly

[WARNING] L004: skills/info-extractor/skill.yaml
  → Input "raw_text" missing description
  → Suggestion: Add description field for better discoverability

Scan complete: 208 skills checked, 3 errors, 7 warnings

Tool 3: backfill-frontmatter — Auto-Fill Missing Frontmatter

For Skill files missing YAML frontmatter, this tool extracts content from SKILL.md and generates standard frontmatter.

# Dry run: preview what would be backfilled
python tools/backfill-frontmatter.py --root ./skills --dry-run

# After reviewing, apply changes
python tools/backfill-frontmatter.py --root ./skills --apply

Typical scenario: Your team wrote SKILL.md files early on without filling in YAML templates. This tool will:

Read the description and trigger keywords from SKILL.md
Infer skill_level from file path and content
Scan import/require statements to extract dependencies
Generate template-compliant frontmatter and prepend it to the file

5. Skill Dependency Graph: skills_dependencies.json

The dependency graph is skill-framework's second-largest data asset (after the inventory itself). It explicitly declares the call relationships between Skills.

Structure design:

{
  "version": "1.0.0",
  "generated_at": "2026-07-01T10:00:00Z",
  "nodes": [
    {
      "id": "info-extractor",
      "level": "L1",
      "group": "data-processing"
    },
    {
      "id": "scoring-engine",
      "level": "L3",
      "group": "risk-management"
    }
  ],
  "edges": [
    {
      "from": "scoring-engine",
      "to": "info-extractor",
      "type": "phase-1",
      "data_contract": "structured-json"
    },
    {
      "from": "scoring-engine",
      "to": "knowledge-rag",
      "type": "phase-2",
      "data_contract": "structured-json"
    }
  ],
  "orphan_nodes": ["unused-skill-demo"]
}

Three killer use cases:

Change impact analysis: Before modifying info-extractor, check edges to see which downstream Skills like scoring-engine are affected
Dead skill discovery: orphan_nodes lists Skills with zero dependencies—candidates for deletion or archival
Tier violation detection: Use alongside skill-lint to catch illegal calls like L1 depending on L3

6. Quality Gate: 6-Step Checklist from Dev to Production

skill-framework ships with audit-checklist.md that defines a 6-step quality gate. Every Skill must pass all items before going live:

Step	Check Item	Owner	Tool Support
1. Structural compliance	YAML fields complete, tier correct	Developer	`skill-lint`
2. Dependency legality	One-way dependencies, no cycles	Developer	`inventory-scan --validate-levels`
3. Security audit	Minimal data scope, sensitive fields masked	Security reviewer	`skill-lint L003`
4. Integration test	End-to-end flow verification, timeout & retry testing	QA engineer	Manual + automated test framework
5. Documentation completeness	README, trigger keyword examples, I/O samples	Developer	`backfill-frontmatter --dry-run`
6. Production approval	Manual sign-off + archived approval record	Tech lead	Manual

Practical command sequence:

# Step 1: Structural compliance
python tools/skill-lint.py --root ./skills --strict

# Step 2: Dependency legality
python tools/inventory-scan.py --root ./skills --validate-levels

# Step 3: Security audit (focus on L003 rule)
python tools/skill-lint.py --root ./skills --rule L003 --strict

# Step 4-5: Documentation backfill
python tools/backfill-frontmatter.py --root ./skills --dry-run
# review dry-run output, then:
python tools/backfill-frontmatter.py --root ./skills --apply

# Step 6: Manual approval (review check report, sign off)
cat output/audit-report.md

7. Four Industry Vertical Blueprints

The framework bundles 4 industry blueprints, each pre-defining the core Skill combinations and dependency relationships for that industry:

Industry	Blueprint File	Core Skill Combo	Special Component
Finance	`blueprints/finance.yaml`	Risk scoring, compliance review, investment research, customer profiling	`financial-ai-skills` integration
Telecom	`blueprints/telecom.yaml`	Complaint analysis, field service dispatch, 5G private network assessment, network root cause	`teleagent-skills` integration
Healthcare	`blueprints/healthcare.yaml`	Medical record extraction, diagnosis assistance, medication review, scheduling optimization	HIPAA compliance Skill
Government	`blueprints/government.yaml`	Document review, policy interpretation, public opinion analysis, approval workflows	Red-header document parser Skill

How to use:

# Initialize a project based on the finance blueprint
python tools/inventory-scan.py \
  --blueprint blueprints/finance.yaml \
  --init ./my-finance-project

The blueprint auto-generates the Skill directory skeleton, dependency declarations, and pre-filled YAML templates for that industry.

8. The Full 208-Skill Inventory at a Glance

unified_skill_inventory.json catalogs 208 Skills, distributed by tier as follows:

Tier	Count	Share	Representative Skills
L0	12	5.8%	mysql-connector, redis-client, oss-handler
L1	86	41.3%	info-extractor, data-analyst, report-generator, security-guard, knowledge-rag
L2	24	11.5%	nl2-query, l3-gw-01, data-query-gateway
L3	58	27.9%	scoring-engine, evidence-chain, live-stream-script-system, contract-review
L4	28	13.5%	agent-teams-orchestrator, l7-arkclaw-01, auto-pilot

The inventory comes in both JSON and CSV—JSON for programmatic use, CSV for Excel filtering and human review.

9. Quick Start

# Clone the repo
git clone https://github.com/yuzhaopeng-up/skill-framework.git
cd skill-framework

# Install dependencies
pip install -r requirements.txt

# Step 1: Scan your Skill project
python tools/inventory-scan.py --root /path/to/your/skills --output ./output

# Step 2: Compliance check
python tools/skill-lint.py --root /path/to/your/skills --template-dir ./templates

# Step 3: Backfill missing metadata
python tools/backfill-frontmatter.py --root /path/to/your/skills --dry-run

# Confirm and apply
python tools/backfill-frontmatter.py --root /path/to/your/skills --apply

Repo structure:

skill-framework/
├── templates/                    # 3 YAML templates
│   ├── l1-base-skill.yaml
│   ├── l2-gateway-skill.yaml
│   └── l3-ceiling-skill.yaml
├── tools/                        # 3 Python tools
│   ├── inventory-scan.py
│   ├── skill-lint.py
│   └── backfill-frontmatter.py
├── blueprints/                   # 4 industry blueprints
│   ├── finance.yaml
│   ├── telecom.yaml
│   ├── healthcare.yaml
│   └── government.yaml
├── data/                         # Data assets
│   ├── unified_skill_inventory.json
│   ├── unified_skill_inventory.csv
│   └── skills_dependencies.json
├── docs/
│   └── audit-checklist.md        # Quality gate checklist
├── LICENSE                       # MIT
└── README.md

10. Design Trade-offs

Decision	Choice	Rejected Alternative	Reason
Classification model	5 tiers	3 tiers, 7 tiers	5 tiers balances granularity and complexity
Template format	YAML	JSON Schema, TOML	YAML is readable, comment-friendly, and dominant in the Agent ecosystem
Dependency declarations	Static files	Runtime discovery	Static declarations enable offline checks and are secure/controllable
Tooling language	Python	Node.js	Higher Python penetration in AI/data teams
License	MIT	Apache 2.0	MIT is the most permissive, lowers adoption barrier

The Open-Source Agent Skills Ecosystem

skill-framework isn't an isolated project—it's the governance hub of an open-source Agent Skills ecosystem. These 5 repos work together to cover the full chain from Skill development to industry adoption:

Repo	Role	GitHub
skill-framework	L0-L4 classification + YAML templates + Python toolchain	https://github.com/yuzhaopeng-up/skill-framework
financial-ai-skills	Finance industry vertical Skill set (risk, compliance, research)	https://github.com/yuzhaopeng-up/financial-ai-skills
teleagent-skills	Telecom industry vertical Skill set (field service, complaints, 5G)	https://github.com/yuzhaopeng-up/teleagent-skills
agent-cluster-comm	Multi-Agent cluster communication protocol & orchestration engine	https://github.com/yuzhaopeng-up/agent-cluster-comm
fintech-h5-demos	Fintech H5 interactive demos (courses/training)	https://github.com/yuzhaopeng-up/fintech-h5-demos

How they work together:

skill-framework defines the standards and tools; other repos follow its classification system and template specs
financial-ai-skills and teleagent-skills are two concrete implementations of industry blueprints
agent-cluster-comm provides the L4 multi-Agent orchestration communication protocol; skill-framework's L4 template is based on it
fintech-h5-demos is the front-end display layer, visualizing Skill execution results as interactive H5

License: MIT — use it, fork it, modify it, just don't delete the copyright notice.

If this framework helped you make sense of Skill governance, drop a Star on https://github.com/yuzhaopeng-up/skill-framework

Why Multi-Agent Clusters Can't Use One Communication Channel: A 5-Layer Stack from Encrypted P2P to GitHub Async Handoff

兆鹏于 — Wed, 01 Jul 2026 12:41:37 +0000

5-Layer Comm Stack: Why Multi-Agent Clusters Can't Use Just One Communication Channel — From Encrypted P2P to GitHub Async Handoff

A single communication channel can't solve every problem in a multi-agent cluster. Encrypted transport, real-time broadcast, human observability, cross-timezone async, and fault self-healing — each layer tackles one dimension. Stack them, and you've got a real solution.

A Real-World Failure Story

3 analysis agents form a real-time decision cluster, communicating over Redis Pub/Sub. Looks solid — until:

Agent-A needs to send intermediate results containing user ID numbers to Agent-B. Redis transmits in plaintext. Compliance audit: rejected.
New York's Agent-C goes offline after work. Next morning, they discover they missed 3 critical decision messages. Redis has no persistent replay.
The Redis node crashes at 2 AM. All 3 agents go silent for 45 minutes with nobody noticing. A major trading signal is missed.

This isn't hypothetical. It's the inevitable trap for every agent cluster that tries to "use one channel for everything."

The core insight: Multi-agent cluster communication needs are inherently multi-dimensional. Any single channel covers exactly one dimension. You need layered stacking.

5-Layer Communication Architecture Overview

┌─────────────────────────────────────────────────────────┐
│  L5  Health Monitor Layer                                │
│  Heartbeat detection · Fault discovery · Failover trigger │
├─────────────────────────────────────────────────────────┤
│  L4  GitHub Async Handoff Layer                          │
│  Issues as message queue · Zero deploy · Cross-TZ persist │
├─────────────────────────────────────────────────────────┤
│  L3  Chat Bot Layer                                      │
│  Human-observable · Approval flows · Notification broadcast│
├─────────────────────────────────────────────────────────┤
│  L2  Redis Message Bus Layer                             │
│  High-frequency broadcast · Pub/Sub · Low latency (<1ms) │
├─────────────────────────────────────────────────────────┤
│  L1  Encrypted Peer-to-Peer Layer                        │
│  E2E encryption · Cross-firewall · Webhook push · PII safe│
└─────────────────────────────────────────────────────────┘

Five-Layer Comparison Matrix

Dimension	L1 Encrypted P2P	L2 Redis Bus	L3 Chat Bot	L4 GitHub Handoff	L5 Health Monitor
Core ability	E2E encrypted direct	High-freq broadcast	Human-observable	Persistent async	Fault self-heal
Latency	10-100ms	<1ms	100-500ms	seconds~hours	Check interval
Persistence	None (optional)	None (configurable)	Yes (chat history)	Yes (Issue history)	Yes (status logs)
Deploy dependency	No central node	Redis Server	Chat platform	GitHub account	Any channel
PII safety	Native	No	No	No	N/A
Human-visible	No	No	Native	Yes	Partial
Cross-timezone	No (must be online)	No (must be online)	Partial	Native	N/A
Bandwidth	Low (1:1)	High (1:N broadcast)	Medium	Low	Very low

L1 Encrypted P2P: When Nobody Can "See" the Data

L1 solves the hardest constraint: data containing PII (Personally Identifiable Information) must never exist as plaintext at any intermediate node during transmission.

How It Works

Agent-A                              Agent-B
  │                                    │
  ├─[1] Generate ECDH keypair          │
  ├─[2] Exchange public key ────────►  │
  │                         [3] Derive shared secret
  │                  ◄──────────── [4] Exchange public key
  │  [5] Derive shared secret (same)   │
  ├─[6] AES-256-GCM encrypt message    │
  ├─[7] Ciphertext ─────────────────►  │
  │                         [8] Decrypt │

Key technical properties:

End-to-end encryption: ECDH negotiates a shared secret, AES-256-GCM encrypts the payload. Intermediate nodes only see ciphertext.
Cross-firewall: Uses webhook push mode — the agent pushes to the peer's exposed HTTPS endpoint. No inbound ports needed.
Zero trust: Each agent pair independently negotiates keys. One key compromise doesn't affect other channels.
Use cases: Financial risk agents passing credit data, medical agents passing patient info, legal agents passing case materials.

# L1 communication example: encrypted P2P send
from agent_cluster_comm import P2PLayer

p2p = P2PLayer(agent_id="risk_agent_a")
p2p.exchange_public_key(peer="risk_agent_b", endpoint="https://agent-b:8443/key-exchange")

# Send encrypted message with PII — Redis can't see it, nobody can
p2p.send_encrypted(
    peer="risk_agent_b",
    payload={"user_id": "610102****", "credit_score": 720, "decision": "approve"}
)

L4 GitHub Async Handoff: Zero-Deploy Message Queue

This is the most "counter-intuitive" layer in the whole architecture — using GitHub Issues as a message queue.

Why GitHub Issues Work as a Message Queue

Message Queue Need	GitHub Issues Equivalent
Send message	Create Issue
Consume message	Read Issue + Close Issue
Message metadata	Labels, Assignees, Milestone
Message history	Issue comment thread
Message partitioning	Repository grouping
Consumer group	Assignee = consumer identity
TTL	Auto-close workflow

Core Advantages

1. Zero Deployment

No Redis. No Kafka. No RabbitMQ. Just a GitHub account and a repo. Your agents might be running on a laptop, a cloud function, or even a Raspberry Pi — if they can call the GitHub API, they can communicate.

2. Cross-Timezone, Naturally

Agent-A creates Issue #42 at 6 PM Beijing time. New York's Agent-C opens GitHub at 9 AM the next morning. Issue #42 is sitting there quietly, with the full context thread intact. No lost messages, no TTL expiration.

3. Human-Auditable

Issues are public (or visible within a private repo). A project manager can drop a comment right in the issue: "Direction looks right, keep going." No other message queue gives you that.

# L4 communication example: GitHub async handoff
from agent_cluster_comm import GitHubHandoffLayer

handoff = GitHubHandoffLayer(
    repo="yuzhaopeng-up/agent-cluster-comm",
    agent_id="research_agent_ny"
)

# Send async message — the other side will get it even when offline
handoff.send(
    target="research_agent_bj",
    subject="Q3 financial data analysis complete",
    body="## Analysis Results\n\n- A-share sector rotation cycle shortened to 4.2 days\n- Suggest watching New Energy + AI crossover track\n\nSee attachment for detailed data.",
    labels=["analysis", "q3-report", "priority-high"]
)

# Receiver consumes the next day
messages = handoff.receive(label="q3-report")
for msg in messages:
    process(msg)
    handoff.acknowledge(msg)  # Close the Issue

Failover: The Self-Healing Loop Driven by L5

This is the "immune system" of the 5-layer architecture. L5 doesn't carry business messages. It does one thing: monitor the health of other layers and trigger switchover when something goes wrong.

Failover Flow

                    Normal State
                        │
             ┌──────────▼──────────┐
             │  L2 Redis running    │
             │  Agents via L2 fast  │
             └──────────┬──────────┘
                        │
             ┌──────────▼──────────┐
             │  L5 heartbeat check  │
             │  ping Redis every 5s │
             └──────────┬──────────┘
                        │ 3 consecutive timeouts
             ┌──────────▼──────────┐
             │  L5 declares L2 down │
             │  Trigger failover    │
             └─────┬─────────┬─────┘
                   │         │
      ┌────────────▼──┐  ┌──▼────────────┐
      │ L3 alert human │  │ L4 takes over │
      │ "Redis is down"│  │ Auto-switch   │
      └───────────────┘  └──┬────────────┘
                            │ L5 keeps probing
               ┌────────────▼────────────┐
               │  L2 recovered?           │
               │  Yes → switch back, stop │
               │  No  → L4 continues,     │
               │        escalate alert    │
               └─────────────────────────┘

Key design principles:

Detect → Alert → Switch, kept separate: L5 detects → L3 tells the human → L4 auto-takes-over. The human is always in the loop.
Migrate back, don't dual-write: When L2 recovers, L4 stops accepting new messages. Remaining L4 messages get consumed, then we switch back. No message duplication.
Degrade, don't circuit-break: From L2 (millisecond-level) down to L4 (second-level). The system still works — just slower.

# L5 failover configuration
from agent_cluster_comm import HealthMonitor

monitor = HealthMonitor(
    check_targets={"L2_redis": {"type": "ping", "interval": 5, "threshold": 3}},
    failover_plan={
        "L2_redis_down": [
            {"action": "alert", "channel": "L3", "message": "Redis bus down, switching to GitHub async channel"},
            {"action": "switch", "from": "L2", "to": "L4"},
            {"action": "keep_probing", "target": "L2_redis", "on_recover": "switch_back"}
        ]
    }
)
monitor.start()

Communication Layer Decision Tree

When your agent needs to send a message, which layer should it use?

Does the message contain PII / sensitive data?
├── Yes → L1 Encrypted P2P (E2E encrypted, no intermediate nodes)
└── No
    ├── Broadcast to multiple agents?
    │   ├── Yes → L2 Redis Message Bus (pub/sub, <1ms latency)
    │   └── No
    │       ├── Need human visibility / approval?
    │       │   ├── Yes → L3 Chat Bot (human-observable, intervenable)
    │       │   └── No
    │       │       ├── Receiver might be offline / cross-timezone?
    │       │       │   ├── Yes → L4 GitHub Async Handoff (zero deploy, persistent)
    │       │       │   └── No → L2 Redis Message Bus (lowest latency)
    │       │       └── Zero-deploy environment?
    │       │           └── Yes → L4 GitHub Async Handoff
    │       └── (fallback: L2)
    └── Need to monitor other layers' health?
        └── Yes → L5 Health Monitor

Quick mnemonic:

Condition	Use This Layer
Contains PII	L1
Need broadcast	L2
Need human eyes	L3
Cross-timezone	L4
Fault prevention	L5

4 Composition Patterns: The Stacking Effect Between Layers

A single layer solves a single-dimension problem. Layer combinations solve real business scenarios.

Pattern 1: Real-Time Analysis Cluster (L1+L2+L3)

Scenario: 3 financial risk agents analyzing transaction flows in real time
     ┌──────────┐
     │ L1 Encrypted P2P │ ← Passing intermediate results with customer ID numbers
     └─────┬────┘
           │
     ┌─────▼──────┐
     │ L2 Redis Bus │ ← Broadcasting "anomaly signal detection complete" notifications
     └─────┬──────┘
           │
     ┌─────▼────────┐
     │ L3 Chat Bot   │ ← Pushing real-time alerts to the risk manager
     └──────────────┘

L1 ensures PII doesn't leak
L2 keeps all 3 agents in sync (millisecond-level)
L3 ensures human decision-makers are informed in real time

Pattern 2: Cross-Timezone Research Team (L2+L3+L4)

Scenario: Beijing, London, New York agents collaborating on research
     ┌──────────┐
     │ L2 Redis  │ ← Same-timezone agents collaborating at high speed
     └─────┬────┘
           │
     ┌─────▼────────┐
     │ L3 Chat Bot   │ ← Cross-timezone but human-visible discussions
     └─────┬────────┘
           │
     ┌─────▼──────────┐
     │ L4 GitHub Handoff│ ← Task handoff when New York agent is off-duty
     └────────────────┘

Beijing Agent-A creates an L4 Issue before signing off, handing off the queue
London Agent-B consumes L4 messages after starting work, then syncs with same-Europe agents via L2
New York Agent-C reports key findings to the team lead via L3

Pattern 3: Failover Chain (L2→L4, Triggered by L5)

Scenario: 24/7 production environment, Redis SPOF is unacceptable
     L5 keeps probing L2 ──failure──→ L4 takes over ──recovery──→ Switch back to L2

Already covered in the failover flow above. Core value: from "Redis dies, everything stops" to "Redis dies, things slow down but keep running."

Pattern 4: Secure Multi-Party Computation (L1+L5)

Scenario: 3 banks' agents jointly training a model, original data mutually invisible
     ┌──────────┐
     │ L1 Encrypted P2P │ ← Only exchange encrypted gradients, raw data stays local
     └─────┬────┘
           │
     ┌─────▼────────┐
     │ L5 Health Monitor│ ← Monitor node liveness, abort computation on anomaly
     └──────────────┘

L1 ensures data can't be intercepted in transit
L5 ensures graceful termination when any participant drops — no silent errors

FAQ

Q: Why not just use gRPC for everything?

A: gRPC is a great RPC framework, but it assumes both parties are online and the network is reachable. It doesn't solve: PII end-to-end encryption, cross-timezone persistence, human observability, or zero deployment. Each layer in the 5-layer stack covers a dimension that gRPC can't.

Q: Using GitHub Issues as a message queue — is the throughput enough?

A: L4's design goal isn't high throughput — it's "zero deploy + cross-timezone + persistence." When you need high throughput, use L2. When you need cross-timezone zero-deploy, use L4. GitHub API rate limit is 5000 requests/hour, which is more than enough for inter-agent async handoffs.

Q: Do I have to deploy all 5 layers?

A: No. Stack them as needed. The minimum deployment is just L4 (zero deploy). Add L2 for real-time, L1 for security, L3 for human collaboration, L5 for production.

Q: How does this differ from AutoGen/CrewAI's communication?

A: AutoGen and CrewAI each have one built-in communication model (conversational / sequential), great for quick prototypes. agent-cluster-comm provides the communication infrastructure layer — it works alongside them. When AutoGen agents need encrypted transport, use L1. When they need cross-timezone, use L4.

Quick Start

pip install agent-cluster-comm

from agent_cluster_comm import ClusterComm

# Minimal config: L4 zero-deploy mode only
comm = ClusterComm(
    agent_id="my_agent_001",
    layers={"L4": {"repo": "your-org/agent-messages", "token": "ghp_xxx"}}
)

# Send async message
comm.send(target="analyst_agent", subject="Data ready", body="Q3 report generated")

# Receive messages
for msg in comm.receive():
    print(f"From {msg.sender}: {msg.subject}")
    comm.acknowledge(msg)

Repo: https://github.com/yuzhaopeng-up/agent-cluster-comm · Apache 2.0 License

Agent Skills Open Source Ecosystem

agent-cluster-comm is the communication infrastructure component of the Agent Skills open-source ecosystem. Here's the full matrix:

Repo	Purpose	GitHub
financial-ai-skills	Financial AI skill pack: risk control, compliance, AML, and other professional Skill sets	github.com/yuzhaopeng-up/financial-ai-skills
teleagent-skills	General agent skill framework: 114+ plug-and-play Skills covering docs, data, publishing, security	github.com/yuzhaopeng-up/teleagent-skills
agent-cluster-comm	Multi-agent cluster comm stack: 5-layer architecture, from encrypted P2P to GitHub async handoff	github.com/yuzhaopeng-up/agent-cluster-comm
skill-framework	Skill development framework: standardized spec, templates, testing, publish pipeline	github.com/yuzhaopeng-up/skill-framework
fintech-h5-demos	Fintech H5 demos: interactive AI showcases, ready for training & roadshows	github.com/yuzhaopeng-up/fintech-h5-demos

Five repos working together: skill-framework defines how Skills are built → teleagent-skills provides the general skill library → financial-ai-skills focuses on the finance vertical → agent-cluster-comm lets multiple Skill-driven agents collaborate securely → fintech-h5-demos makes everything tangible and demoable.

Stars, Forks, and PRs welcome. Every Issue is a vote for the future of multi-agent communication.

4-Phase Orchestration: 5 Universal Agent Skills with YAML-Driven Rules, Composable Components, and Graceful Degradation

兆鹏于 — Wed, 01 Jul 2026 12:34:54 +0000

4-Phase Orchestration: How 5 Universal Agent Skills Achieve YAML-Driven Rules + Composable Components + Graceful Degradation

When you're hard-coding your 3rd scoring if-else, maybe it's time to ask: can I move the rules into YAML and let the business change config instead of code?

The Problem: Why Do Agent Skills Keep Reinventing the Wheel?

Every Agent developer faces the same dilemma — every business scenario rewrites a similar pipeline:

Scoring: Extract features → Match rules → Calculate score → Generate report
Complaints: Extract ticket → Cross-validate → Pinpoint root cause → Archive
Querying: Understand intent → Build SQL → Execute query → Render chart

The skeleton is identical. What changes is only the "content" at each step. Yet every team builds pipelines from scratch.

teleagent-skills offers an answer: freeze the skeleton into 5 universal Skills with 4-Phase orchestration, and let business changes live in YAML config only.

Architecture Overview: 4-Phase Pipeline + 5 Universal Skills

2.1 4-Phase Orchestration Diagram

┌─────────────────────────────────────────────────────────────┐
│                    Upper Business Skill                      │
│  (Scoring Engine / Evidence Chain / Data Aggregator / ...)  │
└──────────┬──────────┬──────────┬──────────┬────────────────┘
           │          │          │          │
           ▼          ▼          ▼          ▼
    ┌──────────┐┌──────────┐┌──────────┐┌──────────┐
    │ Phase 1  ││ Phase 2  ││ Phase 3  ││ Phase 4  │
    │ Extract  ││ Analyze  ││ Generate ││ Archive  │
    │          ││          ││          ││          │
    │Info-     ││Data-     ││Report-   ││Archive-  │
    │Extractor ││Analyst   ││Generator ││Manager   │
    └────┬─────┘└────┬─────┘└────┬─────┘└────┬─────┘
         │           │           │           │
         ▼           ▼           ▼           ▼
    ┌─────────────────────────────────────────────────┐
    │          JSON Contract (Structured Data Contract) │
    │   phase1_output.json → phase2_input.json → ...  │
    └─────────────────────────────────────────────────┘

Core idea: each Phase is an independent component, and Phases pass data only through JSON contracts.

Any Phase can be replaced (want a more powerful Analyzer? Swap it out)
Any Phase can be skipped (degradation mode)
Any Phase can be reused (5 Skills share the same Extract component)

2.2 JSON Contract Example

{
  "phase": "extract",
  "skill": "scoring-engine",
  "output": {
    "entities": [
      {
        "name": "客户A",
        "type": "enterprise_customer",
        "attributes": {
          "annual_revenue": 50000000,
          "employee_count": 320,
          "industry": "制造"
        }
      }
    ],
    "metadata": {
      "extraction_time": "2026-07-01T10:30:00Z",
      "source": "CRM_API",
      "confidence": 0.92
    }
  },
  "next_phase": "analyze"
}

Three Design Principles, Deep Dive

Principle 1: YAML-Driven, Parameterized Rules

The traditional approach hard-codes scoring rules:

# Hard-coded — every business change means a code change
if customer.revenue > 10000000:
    score += 30
elif customer.revenue > 5000000:
    score += 20

teleagent-skills does it differently — all rules are externalized to YAML:

# scoring_rules.yaml
scoring_engine:
  name: "政企客户商机评分"
  version: "2.1"
  dimensions:
    - id: revenue
      name: "营收规模"
      weight: 0.30
      rules:
        - condition: "attributes.annual_revenue >= 100000000"
          score: 30
          label: "超大型"
        - condition: "attributes.annual_revenue >= 50000000"
          score: 20
          label: "大型"
        - condition: "attributes.annual_revenue >= 10000000"
          score: 10
          label: "中型"
    - id: industry
      name: "行业属性"
      weight: 0.25
      rules:
        - condition: "attributes.industry in ['金融','医疗']"
          score: 25
          label: "高价值行业"
    - id: growth
      name: "增长潜力"
      weight: 0.20
    - id: connectivity
      name: "接入成熟度"
      weight: 0.15
    - id: decision_chain
      name: "决策链清晰度"
      weight: 0.10
  thresholds:
    high: 70
    medium: 40
    low: 0

Business changed? Edit YAML. Need a new dimension? Add YAML. Zero code changes.

Principle 2: Composable Components — 4-Phase Orchestration + JSON Contracts

All 5 Skills share the same 4-Phase skeleton, but each Skill's Phase behavior differs:

Skill	Phase 1 Extract	Phase 2 Analyze	Phase 3 Generate	Phase 4 Archive
Scoring Engine	Extract scoring object attributes	Load YAML rules & match scores	Generate scoring report + recommendations	Archive scoring records
Evidence Chain	Extract evidence from multiple sources	Cross-validate + conflict detection	Generate evidence chain report	Archive validation records
Data Aggregator	Validate & clean raw data	Aggregation + YoY/MoM calculation	Output statistical report	Archive aggregated results
Visualization Renderer	Analyze data characteristics	Generate ECharts config	Render HTML/Dashboard	Cache chart assets
NL2Query	Extract query intent + entities	Build SQL + confidence assessment	Format query results	Record query logs

The power of composability: upper-level Skills can chain lower-level Skills on demand. For example, the data query gateway pipeline:

NL2Query(Phase1-2) → Data Aggregator(Phase1-2) → Visualization Renderer(Phase1-3)

One natural language query automatically flows through "understand → query → aggregate → visualize" end to end.

Principle 3: Graceful Degradation — When Sub-Components Fail

The 4-Phase architecture has built-in 3-tier degradation:

┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Phase 1   │────▶│   Phase 2   │────▶│   Phase 3   │────▶│   Phase 4   │
│   Extract   │     │   Analyze   │     │   Generate  │     │   Archive   │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │                   │
       ▼                   ▼                   ▼                   ▼
  ┌──────────┐       ┌──────────┐       ┌──────────┐       ┌──────────┐
  │ Return   │       │ Skip     │       │ Simplify │       │ Local    │
  │ raw      │       │ analysis │       │ template │       │ cache +  │
  │ input +  │       │ mark low │       │ raw data │       │ deferred │
  │ conf=0   │       │ confidence│      │ passthrough│      │ retry    │
  └──────────┘       └──────────┘       └──────────┘       └──────────┘

The core principle of degradation: I'd rather give the user a low-confidence result than crash with an error.

The 5 Skills, One by One

4.1 Scoring Engine

What it is: A multi-dimensional weighted scoring component driven by YAML rule configs.

Typical scenarios: enterprise opportunity scoring, vendor evaluation, customer churn prediction, partner tiering.

Input → Rule matching → Scoring output flow:

{
  "customer": "客户A",
  "total_score": 78,
  "grade": "A级-重点跟进",
  "dimension_breakdown": {
    "revenue": { "score": 20, "max": 30, "label": "大型" },
    "industry": { "score": 15, "max": 25, "label": "中价值行业" },
    "growth": { "score": 20, "max": 20 },
    "connectivity": { "score": 15, "max": 15 },
    "decision_chain": { "score": 8, "max": 10 }
  },
  "recommendation": "建议安排专属客户经理，优先推荐5G专网+云网融合方案"
}

4.2 Evidence Chain

What it is: A multi-source evidence cross-validation component that detects conflicts, evaluates confidence, and pinpoints root causes.

Data Source 1: Customer complaints    ──┐
                                         │     ┌──────────────────┐
Data Source 2: System alert logs      ──┼────▶│  Evidence Chain  │
                                         │     │  Phase2: Analyze │
Data Source 3: SLA monitoring data    ──┘     └────────┬─────────┘
                                                  │
                          ┌───────────────────────┐│
                          │ Cross-validation:      │
                          │ • Complaint: "2hr outage"
                          │ • Alert: "optical attenuation"│
                          │ • SLA: "99.1% availability" │
                          │ Conflict detected:      │
                          │ Complaint vs SLA surface contradiction│
                          │ Root cause="optical attenuation": 0.87│
                          └───────────────────────┘

4.3 Data Aggregator

What it is: A raw data re-processing component supporting validation/cleaning, aggregation, YoY/MoM, and TOP rankings.

Raw query results           Aggregator output
┌──────────────┐         ┌──────────────────────────────┐
│ 300 rows of  │───────▶ │ Monthly summary + YoY/MoM    │
│ detail data  │         │ TOP10 ranking                │
│ (region x    │         │ Anomaly flags (>2σ)          │
│  month)      │         │ Trend direction              │
└──────────────┘         └──────────────────────────────┘

4.4 Visualization Renderer

What it is: An automated rendering component that turns structured data into ECharts charts and Dashboards.

Chart type selection is automatic: time-series → line chart, categorical → bar/pie, multi-dimensional → radar.

4.5 NL2Query

What it is: A smart natural-language-to-structured-query conversion component.

User input: "华东区上月5G专网新增客户数"

Phase1 Extract:  intent="query", entities=[region="华东", time="上月", metric="5G专网"]
Phase2 Analyze:  Generate SQL + confidence 0.88
Phase3 Generate: Format results
Phase4 Archive:  Record query log

Confidence scoring: When confidence drops below a threshold, the output gets a "low confidence" warning and shows the SQL for human review.

Industry Use-Case Matrix

Industry	Scoring Engine	Evidence Chain	Data Aggregator	Viz Renderer	NL2Query
Finance	Customer credit rating	Anti-money-laundering multi-source verification	Transaction volume YoY/MoM	Risk control dashboard	"Check a customer's last 3 months of transactions"
Manufacturing	Supplier evaluation	QA vs. production line verification	Line OEE statistics	Capacity dashboard	"Check line A's yield this month"
Retail	Member value scoring	Data conflict detection	SKU sales aggregation	Sales heatmap	"Top sellers in East China"
Healthcare	Patient risk stratification	Diagnosis vs. lab verification	Department admission stats	Bed occupancy dashboard	"Available beds in cardiology"

Key insight: All 5 Skills adapt to different industries purely through YAML rule configs. The scoring engine code logic is identical — only the YAML files differ.

How This Fundamentally Differs from Existing Frameworks

Dimension	LangChain/LlamaIndex	AutoGen/CrewAI	teleagent-skills
Orchestration	Code-level Chain	Multi-Agent dialogue	4-Phase declarative orchestration
Rule management	Hard-coded in code	Described in prompts	YAML parameterized config
Degradation strategy	try-catch	Retry dialogue	Declarative degradation config
Business adaptation	Change code	Change prompts	Change YAML

Quick Start

git clone https://github.com/yuzhaopeng-up/teleagent-skills.git
cd teleagent-skills

# Scoring engine example
cp skills/scoring-engine/config/scoring_rules.yaml my_rules.yaml
# Edit my_rules.yaml to customize your scoring dimensions
# Copy the Skill directory into your Agent platform's skills directory

License: Apache 2.0

Agent Skills Open-Source Ecosystem

Repo	What It Does	GitHub
financial-ai-skills	Financial AI skill library: 104 scenarios, pure Python	https://github.com/yuzhaopeng-up/financial-ai-skills
teleagent-skills	5 universal business Skills: 4-Phase orchestration + YAML-driven rules	https://github.com/yuzhaopeng-up/teleagent-skills
agent-cluster-comm	Multi-Agent cluster 5-layer communication architecture	https://github.com/yuzhaopeng-up/agent-cluster-comm
skill-framework	Skill governance framework: L0-L4 classification + YAML templates + Python tools	https://github.com/yuzhaopeng-up/skill-framework
fintech-h5-demos	12 zero-dependency financial H5 dashboard demos	https://github.com/yuzhaopeng-up/fintech-h5-demos

Stop building pipelines from scratch. The 4-Phase skeleton is ready — you just need to write YAML.

Star teleagent-skills and let's standardize Agent Skills together.

104 Financial AI Skills, Zero API Cost: Pure Python, Millisecond Response

兆鹏于 — Wed, 01 Jul 2026 12:29:14 +0000

Zero-API Financial AI Skills Library: 104 Scenarios in Pure Python, Millisecond Response

The API Bill Problem in Financial AI

Here's a paradox I kept running into in financial AI: business teams want "instant and ready," but tech teams deliver "integrate the API first."

One invoice verification scenario — hook into a third-party OCR service, 0.3 RMB per call, 100K calls a day, that's 90K RMB/month. One financial report analysis need — call an LLM API, 0.8 RMB per request in token costs, batch process 500 reports during earnings season, burn 400 RMB in a day. And don't get me started on budget control and risk scoring — those high-frequency scenarios where API costs scale linearly with volume. By year-end reconciliation, your AI project's ROI might be worse than a traditional Excel macro.

This isn't a capability problem. It's an architecture decision problem.

The financial-ai-skills repo takes a different path: pure Python standard library, zero API cost, millisecond response, 104 financial scenarios ready to go.

Repo: https://github.com/yuzhaopeng-up/financial-ai-skills

1. The Full Picture: 104 Skills Architecture

financial-ai-skills isn't a demo project — it's an engineered financial AI skills library covering 23 major categories and 104 specific Skills, published as 7 standalone Python packages.

1.1 Layered Scenario Architecture

┌─────────────────────────────────────────────────────────┐
│                  financial-ai-skills                     │
│                   104 Skills / 23 类                     │
├──────────────┬──────────────┬────────────────────────────┤
│  Financial   │   Wealth     │   Risk & Compliance        │
│  Intelligence│   Management │                             │
│  6大类       │   8大类      │   9大类                     │
│  27 Skill    │  36 Skill   │   41 Skill                  │
├──────────────┴──────────────┴────────────────────────────┤
│            Infrastructure Skills (通用层)                 │
│   wecom-template-card │ customer-marketing               │
│   product-manual-rag  │ application-material-checker     │
└─────────────────────────────────────────────────────────┘

1.2 Seven Published Packages

Package	Domain	Skills	Core Capabilities
`financial-intelligence`	Financial Intelligence	27	Invoice verification, budget control, report analysis, tax calculation, cost accounting, fund forecasting
`wealth-management`	Wealth Management	36	Asset allocation, portfolio analysis, product recommendation, risk assessment, portfolio optimization
`risk-compliance`	Risk & Compliance	41	Enterprise risk assessment, AML detection, compliance review, credit scoring, alert monitoring
`wecom-template-card`	IM Output	-	Markdown to WeCom/Feishu/DingTalk template cards, one-click adaptation for all three IM platforms
`customer-marketing`	Customer Marketing	-	Customer profiling, precision marketing, churn prediction, campaign effectiveness
`product-manual-rag`	Product Knowledge	-	Product manual RAG retrieval, clause parsing, comparison recommendations
`application-material-checker`	Material Review	-	Account opening review, loan application check, compliance document verification

1.3 Key Design Principles

Zero external dependencies: All Skills rely solely on the Python standard library (json, re, datetime, math, decimal, etc.) — no third-party APIs or LLM services. This means:

Install and use immediately, no API key needed
Response times in milliseconds (no network I/O)
Cost is always zero, regardless of call volume
Deployable in intranet / offline environments

Built-in mock data: Each Skill package ships with complete mock datasets — experience full functionality without configuring a database. Swap in real data sources for production.

2. Hands-On: 5 Typical Scenarios

2.1 Invoice Verification — Instantly Check Authenticity

$ python financial_cli.py invoice 011001900111 12345678

Field	Value
Invoice Code	011001900111
Invoice Number	12345678
Verification Result	Invoice information matches
Issue Date	2025-03-15
Buyer Name	Beijing Technology Co., Ltd.
Amount	12,800.00
Tax	1,664.00
Total (incl. tax)	14,464.00
Status	Valid

Invoice verification is a high-frequency operation in financial shared service centers. The traditional approach hooks into the tax bureau API or a third-party OCR — dealing with network timeouts, rate limiting, and billing reconciliation. The financial-intelligence package does it differently: local rule engine + validation algorithms, entire process under 5ms.

Core code:

from financial_intelligence import InvoiceChecker

checker = InvoiceChecker()
result = checker.verify("011001900111", "12345678")
print(result.to_markdown())

2.2 Budget Control — Instant Overspend Alerts

$ python financial_cli.py budget 市场部

Category	Budget	Used	Utilization	Status
Ad Spend	500,000	523,400	104.7%	Over Budget
Event Execution	300,000	287,600	95.9%	Warning
Media Partnerships	200,000	178,300	89.2%	Normal
Brand Building	150,000	156,200	104.1%	Over Budget
Total	1,150,000	1,145,500	99.6%	Critical

The pain point with budget control isn't "can't calculate" — it's not fast enough, not pushed in time. Millisecond response from local Skills makes real-time budget gatekeeping possible.

from financial_intelligence import BudgetEngine

engine = BudgetEngine()
report = engine.check_department("市场部")
print(report.to_markdown())

2.3 Financial Report Quick Read — Key Metrics & YoY at a Glance

$ python financial_cli.py report 美的集团 2025

Metric	2025	2024	YoY Change	Trend
Revenue	4,023B	3,737B	+7.7%	Up
Net Profit (attributable)	385B	337B	+14.2%	Strong Up
Gross Margin	26.8%	25.3%	+1.5pp	Up
Net Margin	9.6%	9.0%	+0.6pp	Up
ROE	24.3%	22.8%	+1.5pp	Up

The report quick-read Skill supports metric extraction, YoY/QoQ calculation, trend judgment, and core conclusion generation.

from financial_intelligence import ReportReader

reader = ReportReader()
summary = reader.analyze("美的集团", year=2025)
print(summary.to_markdown())

2.4 Asset Allocation — Personal Wealth Management Engine

from wealth_management import WealthEngine

engine = WealthEngine()
allocation = engine.get_allocation("张伟")
print(allocation.to_markdown())

The wealth-management package has the most Skills of all 7 packages (36), covering the entire wealth management chain from risk profiling and asset allocation to portfolio analysis and product recommendations.

2.5 Enterprise Risk Assessment — Multi-Dimensional Risk Profiling

from risk_compliance import RiskEngine

engine = RiskEngine()
risk = engine.get_enterprise_risk("比亚迪")
print(risk.to_markdown())

The risk-compliance package has the broadest scenario coverage (41 Skills) — from enterprise risk assessment and AML rule detection to compliance document review, forming a complete local risk rule engine.

3. Architecture: How Does It Achieve Zero API + Millisecond Response?

3.1 Three-Layer Architecture

┌──────────────────────────────────────────┐
│            CLI / API Interface Layer      │
├──────────────────────────────────────────┤
│            Skill Business Logic Layer     │
│  Rule Engine + Decision Tree + Scoring   │
│  Model + Validation Algorithms           │
├──────────────────────────────────────────┤
│            Data Adapter Layer             │
│  Mock Data (built-in) | DB Adapter (swap)│
└──────────────────────────────────────────┘

3.2 Performance Benchmarks

Scenario	Skill Execution	Comparable API	Speedup
Invoice Verification	3ms	200-500ms	67-167x
Budget Control	5ms	300-800ms	60-160x
Report Quick Read	8ms	1000-3000ms	125-375x
Asset Allocation	12ms	500-1500ms	42-125x
Risk Assessment	15ms	2000-5000ms	133-333x

4. Quick Start

# Clone the repo
git clone https://github.com/yuzhaopeng-up/financial-ai-skills.git
cd financial-ai-skills

# Run directly, no dependencies to install (pure standard library)
python financial_cli.py --help

Yep, no pip install step. Because there are zero third-party dependencies.

5. IM Integration: From Skill Output to WeCom / Feishu / DingTalk

The wecom-template-card package solves the last-mile problem of pushing Skill output to IM messages:

from wecom_template_card import MarkdownCardBuilder

builder = MarkdownCardBuilder()
card = builder.from_markdown(
    title="Budget Overspend Alert",
    markdown_table=report.to_markdown(),
    platform="wecom"  # supports wecom / feishu / dingtalk
)

Same Skill output, zero modifications needed to push to all three IM platforms.

6. Use Cases

Internal financial tools: No need to send sensitive data to third-party APIs
High-frequency batch processing: 10K+ calls per day scenarios
Offline / intranet environments: Deployment environments without external API access
MVP rapid validation: Get the logic working first, then wire up real data
Training & education: Built-in mock data, students can start experimenting immediately with zero setup

Agent Skills Open Source Ecosystem

financial-ai-skills is the finance-industry vertical package in the Agent Skills open source ecosystem. The entire ecosystem follows a unified Skill specification — each repo works standalone, or can be composed together.

Repo	Role	GitHub
financial-ai-skills	Financial AI Skills Library: 104 scenarios in pure Python	https://github.com/yuzhaopeng-up/financial-ai-skills
teleagent-skills	General Agent Skills: 4-Phase orchestration + rule parameterization	https://github.com/yuzhaopeng-up/teleagent-skills
agent-cluster-comm	Multi-Agent cluster 5-layer communication architecture	https://github.com/yuzhaopeng-up/agent-cluster-comm
skill-framework	Skill governance framework: L0-L4 classification + YAML templates + Python tools	https://github.com/yuzhaopeng-up/skill-framework
fintech-h5-demos	12 zero-dependency financial H5 dashboard demos	https://github.com/yuzhaopeng-up/fintech-h5-demos

Start your zero-API financial AI journey with financial-ai-skills — clone and run, millisecond response, MIT license.

DEV Community: 兆鹏 于

AI Skill Practical Training Course — Student Step-by-Step Guide

AI Skill Practical Training Course — Student Step-by-Step Guide

Pre-Course Preparation (5 minutes)

1. Verify Your TeleAgent Is Ready

2. Download Course Resource Pack

3. Import Your First Skill (Pre-class Exercise)

Challenge Roadmap

Challenge 1: Information Extraction & Archiving (30 min)

Challenge Goal

Real-World Application

Challenge Steps

Step 1: Import Template (5 min)

Step 2: Understand Parameter Configuration (10 min)

Step 3: Modify for Your Own Scenario (10 min)

Step 4: Connect to Feishu Table (5 min, optional)

Challenge 2: Daily/Weekly Report Auto-Generation (30 min)

Challenge Goal

Real-World Application

Challenge Steps

Step 1: Import Template

Step 2: Configure Data Source (10 min)

Step 3: Connect Real Data Source (10 min)

Step 4: Customize Report Template (5 min)

Challenge 3: Material Audit Assistant (30 min)

Challenge Goal

Real-World Application

Key Concept: Audit Rule Table

Challenge 4: Knowledge Base Q&A Assistant (30 min)

Challenge Goal

Real-World Application

Two Ways to Configure Knowledge Base

Challenge 5: Permission Self-Check (30 min)

Challenge Goal

Why It Matters

Key Concept: Permission Audit

Challenge 6: Message Linker (30 min)

Challenge Goal

Real-World Application

Keyword Rule Configuration

Advanced Challenge: L2 Skills (90 min)

Challenge 7: Unified Entry Gateway (45 min)

Challenge Goal

Scenario

Key Concepts

Challenge 8: Multi-Agent Complaint Handling Flow (45 min)

Challenge Goal

Scenario

Flow Nodes

Post-Course Skill Certification

Certification Standards

Certification Method

Appendix

A. Quick Reference Card

B. Troubleshooting Guide

C. Post-Course Resources

Ten Layers of AI Skill Construction: A Systematic Framework from Prompts to Business Closed Loops

Ten Layers of AI Skill Construction: A Systematic Framework from Prompts to Business Closed Loops

Layer 1: Pure Prompt Skill — The Zero-Code Starting Point

Layer 2: Component Skill — Structured Enhancement with Resources

Layer 3: Workflow Skill — Multi-Step Decision Trees

Layer 4: Orchestration Skill — Multi-Agent Coordination

Layer 5: Security Skill — Permission Control and Protection

Layer 6: Scoring Skill — Rule Engine Parameterization

Layer 7: Validation Skill — Multi-Source Evidence Cross-Validation

Layer 8: Approval Skill — Human-in-the-Loop Risk Control

Layer 9: Composition Skill — Multi-Skill Orchestration Benchmark

Layer 10: Closed-Loop Skill — End-to-End Business Closed Loop System

How to Assess Your Skill Level

Upgrade Path from Layer 1 to Layer 10

Conclusion

SOE Compliant Office Agent Skill System: 20 Skills Across 3 Domains

SOE Compliant Office Agent Skill System: Full-Stack Practice of 20 Skills Across 3 Domains

Introduction: Why State-Owned Enterprises Need Dedicated Compliance Office Skills

Three-Domain Architecture: Document Operations + Compliance Security + Reporting Analysis

Three Key Differentiators

1. Built-in Compliance

2. Audit Trail by Default

3. National Standards Ready

Technical Implementation: Zero API Cost + Pure Python + Millisecond Response

DEV Community: 兆鹏于