<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: ryoto miyake</title>
    <description>The latest articles on DEV Community by ryoto miyake (@ryoto_miyake).</description>
    <link>https://dev.to/ryoto_miyake</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3264642%2F947e6ca9-7ad0-408f-a83a-47a2e99127b7.png</url>
      <title>DEV Community: ryoto miyake</title>
      <link>https://dev.to/ryoto_miyake</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ryoto_miyake"/>
    <language>en</language>
    <item>
      <title>Claude-Gemini Integration Tool "CGMB" v1.1.0: Implementing Windows Support</title>
      <dc:creator>ryoto miyake</dc:creator>
      <pubDate>Mon, 12 Jan 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/ryoto_miyake/claude-gemini-integration-tool-cgmb-v110-implementing-windows-support-kcj</link>
      <guid>https://dev.to/ryoto_miyake/claude-gemini-integration-tool-cgmb-v110-implementing-windows-support-kcj</guid>
      <description>&lt;h1&gt;
  
  
  I've released v1.1.0 of "CGMB," an MCP tool that integrates Claude Code and Gemini.
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;Previous article: &lt;a href="https://dev.to/ryoto_miyake/i-built-cgmb-an-mcp-that-unifies-claude-code-gemini-cli-and-gemini-api-3b0i"&gt;Bridging Claude and Gemini: Creating the Multimodal Integration Tool "CGMB"&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What You'll Learn
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;4 new features added in v1.1.0&lt;/li&gt;
&lt;li&gt;Implementation details of Windows path normalization&lt;/li&gt;
&lt;li&gt;How URL auto-routing works&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  v1.1.0 New Features
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🪟 Windows Support&lt;/td&gt;
&lt;td&gt;Path normalization &amp;amp; drive letter handling&lt;/td&gt;
&lt;td&gt;✅ Full Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;📝 OCR Feature&lt;/td&gt;
&lt;td&gt;Scanned PDF support&lt;/td&gt;
&lt;td&gt;✅ New&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔄 URL Auto-Routing&lt;/td&gt;
&lt;td&gt;Layer selection by URL type&lt;/td&gt;
&lt;td&gt;✅ New&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🚀 Latest Models&lt;/td&gt;
&lt;td&gt;gemini-3-flash, gemini-2.5-flash&lt;/td&gt;
&lt;td&gt;✅ Supported&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Windows Path Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Challenge
&lt;/h3&gt;

&lt;p&gt;v1.0 supported only Unix paths, while Windows paths have these characteristics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Start with a drive letter (&lt;code&gt;C:&lt;/code&gt;, &lt;code&gt;D:&lt;/code&gt;, etc.)&lt;/li&gt;
&lt;li&gt;Use backslash (&lt;code&gt;\&lt;/code&gt;) as separator&lt;/li&gt;
&lt;li&gt;May contain forward slashes mixed in (&lt;code&gt;C:/Users/...&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Node.js's &lt;code&gt;path.isAbsolute()&lt;/code&gt; correctly identifies Windows absolute paths, but it does not normalize mixed separators, so slashes have to be unified in a separate step.&lt;/p&gt;

&lt;h3&gt;
  
  
  Implementation
&lt;/h3&gt;

&lt;p&gt;Handled in the &lt;code&gt;validateAndNormalize&lt;/code&gt; method of &lt;code&gt;CGMBServer.ts&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Detect Windows absolute path pattern (case-insensitive)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isWindowsAbsolutePath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sr"&gt;/^&lt;/span&gt;&lt;span class="se"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;A-Za-z&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;:&lt;/span&gt;&lt;span class="se"&gt;[/\\]&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;isWindows&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;isWindowsAbsolutePath&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Normalize forward slashes to backslashes&lt;/span&gt;
  &lt;span class="nx"&gt;preprocessedPath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\/&lt;/span&gt;&lt;span class="sr"&gt;/g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;normalizedPath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;preprocessedPath&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Absolute path detection (considering Windows pattern)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isAbsolute&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isAbsolute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;normalizedPath&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;isWindowsAbsolutePath&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;resolvedPath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;isAbsolute&lt;/span&gt;
  &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;normalizedPath&lt;/span&gt;
  &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;baseDir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;normalizedPath&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key Points:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Regex &lt;code&gt;/^[A-Za-z]:[/\\]/&lt;/code&gt; detects drive letters&lt;/li&gt;
&lt;li&gt;Unify separators before calling &lt;code&gt;path.normalize()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Combine &lt;code&gt;path.isAbsolute()&lt;/code&gt; result with Windows pattern detection&lt;/li&gt;
&lt;/ol&gt;
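&lt;p&gt;Putting the three points together, the flow can be sketched as a standalone function. This is illustrative only: &lt;code&gt;path.win32&lt;/code&gt; is used explicitly so the sketch behaves the same on any OS, whereas the actual &lt;code&gt;validateAndNormalize&lt;/code&gt; presumably branches on the running platform.&lt;/p&gt;

```typescript
// Illustrative sketch of the normalization flow described above.
// path.win32 is used explicitly so the behavior is reproducible on any OS;
// the actual CGMBServer.ts code presumably checks process.platform instead.
import * as path from "path";

function normalizeWindowsPath(filePath: string, baseDir: string): string {
  // Detect a drive-letter prefix such as "C:\" or "C:/" (case-insensitive)
  const isWindowsAbsolutePath = /^[A-Za-z]:[/\\]/.test(filePath);

  // Unify forward slashes to backslashes before normalizing
  const preprocessed = isWindowsAbsolutePath
    ? filePath.replace(/\//g, "\\")
    : filePath;

  const normalized = path.win32.normalize(preprocessed);

  // Combine isAbsolute() with the drive-letter detection
  const isAbsolute = path.win32.isAbsolute(normalized) || isWindowsAbsolutePath;
  return isAbsolute ? normalized : path.win32.resolve(baseDir, normalized);
}

console.log(normalizeWindowsPath("C:/Users/name/file.txt", "C:\\base"));
// C:\Users\name\file.txt
```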




&lt;h2&gt;
  
  
  File Path Extraction from Prompts
&lt;/h2&gt;

&lt;p&gt;v1.1.0 automatically detects file paths written in prompts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Regex to detect both Windows + Unix paths&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;filePathRegex&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;(?:[&lt;/span&gt;&lt;span class="sr"&gt;A-Za-z&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\\[^\s&lt;/span&gt;&lt;span class="sr"&gt;"'&amp;lt;&amp;gt;|&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;\.[&lt;/span&gt;&lt;span class="sr"&gt;a-zA-Z0-9&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;+|&lt;/span&gt;&lt;span class="se"&gt;\/(?!&lt;/span&gt;&lt;span class="sr"&gt;https&lt;/span&gt;&lt;span class="se"&gt;?&lt;/span&gt;&lt;span class="sr"&gt;:&lt;/span&gt;&lt;span class="se"&gt;)[^\s&lt;/span&gt;&lt;span class="sr"&gt;"'&amp;lt;&amp;gt;|&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;\.[&lt;/span&gt;&lt;span class="sr"&gt;a-zA-Z0-9&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;+|&lt;/span&gt;&lt;span class="se"&gt;\.\.?\/[^\s&lt;/span&gt;&lt;span class="sr"&gt;"'&amp;lt;&amp;gt;|&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;\.[&lt;/span&gt;&lt;span class="sr"&gt;a-zA-Z0-9&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="sr"&gt;/gi&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;localPathsInPrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;filePathRegex&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This enables usage like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CGMB analyze C:\Users\name\Documents\report.pdf
CGMB analyze /home/user/documents/report.pdf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
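&lt;p&gt;A quick check of the detection regex against a sample prompt (the regex is the one shown above; the prompt text is hypothetical):&lt;/p&gt;

```typescript
// Same Windows + Unix path detection regex as above, run against a sample prompt
const filePathRegex = /(?:[A-Za-z]:\\[^\s"'<>|]+\.[a-zA-Z0-9]+|\/(?!https?:)[^\s"'<>|]+\.[a-zA-Z0-9]+|\.\.?\/[^\s"'<>|]+\.[a-zA-Z0-9]+)/gi;

const prompt = 'Analyze C:\\Users\\name\\report.pdf and ./notes/summary.txt please';
const localPathsInPrompt = prompt.match(filePathRegex) || [];

console.log(localPathsInPrompt);
// [ 'C:\\Users\\name\\report.pdf', './notes/summary.txt' ]
```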






&lt;h2&gt;
  
  
  URL Auto-Routing
&lt;/h2&gt;

&lt;p&gt;Determines the URL type and automatically routes each request to the optimal AI layer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nf"&gt;detectUrlType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pdf&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;image&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;audio&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;web&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lower&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;urlPath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;?&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="nx"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;urlPath&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.pdf&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/pdf&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pdf&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\.(&lt;/span&gt;&lt;span class="sr"&gt;png|jpg|jpeg|gif|webp|bmp|svg&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="sr"&gt;$/&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;urlPath&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;image&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\.(&lt;/span&gt;&lt;span class="sr"&gt;mp3|wav|m4a|ogg|flac|aac&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="sr"&gt;$/&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;urlPath&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;audio&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;web&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Routing Destinations
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;URL Type&lt;/th&gt;
&lt;th&gt;Destination&lt;/th&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;PDF&lt;/td&gt;
&lt;td&gt;AI Studio&lt;/td&gt;
&lt;td&gt;OCR processing via Gemini File API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Images/Audio&lt;/td&gt;
&lt;td&gt;AI Studio&lt;/td&gt;
&lt;td&gt;Multimodal processing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web Pages&lt;/td&gt;
&lt;td&gt;Gemini CLI&lt;/td&gt;
&lt;td&gt;Real-time information retrieval&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
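&lt;p&gt;The routing decision can be exercised directly with a standalone copy of the method above (it is a private class method in the original; the sample URLs are hypothetical):&lt;/p&gt;

```typescript
// Standalone copy of detectUrlType (a private method in the original class)
function detectUrlType(url: string): 'pdf' | 'image' | 'audio' | 'web' {
  const lower = url.toLowerCase();
  const urlPath = lower.split('?')[0] ?? lower;

  if (urlPath.endsWith('.pdf') || lower.includes('/pdf')) {
    return 'pdf';
  }
  if (/\.(png|jpg|jpeg|gif|webp|bmp|svg)$/.test(urlPath)) {
    return 'image';
  }
  if (/\.(mp3|wav|m4a|ogg|flac|aac)$/.test(urlPath)) {
    return 'audio';
  }
  return 'web';
}

console.log(detectUrlType('https://example.com/paper.pdf'));      // pdf   -> AI Studio
console.log(detectUrlType('https://example.com/photo.png?w=80')); // image -> AI Studio
console.log(detectUrlType('https://example.com/article'));        // web   -> Gemini CLI
```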




&lt;h2&gt;
  
  
  Installation &amp;amp; Upgrade
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# New installation&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; claude-gemini-multimodal-bridge

&lt;span class="c"&gt;# Upgrade&lt;/span&gt;
npm update &lt;span class="nt"&gt;-g&lt;/span&gt; claude-gemini-multimodal-bridge
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Improvements from v1.0.0
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;v1.0.0&lt;/th&gt;
&lt;th&gt;v1.1.0&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Windows Support&lt;/td&gt;
&lt;td&gt;❌ Unix only&lt;/td&gt;
&lt;td&gt;✅ Full support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OCR Feature&lt;/td&gt;
&lt;td&gt;❌ None&lt;/td&gt;
&lt;td&gt;✅ Scanned PDF support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;URL Routing&lt;/td&gt;
&lt;td&gt;Basic&lt;/td&gt;
&lt;td&gt;✅ Type-based auto-selection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini Models&lt;/td&gt;
&lt;td&gt;gemini-2.0-flash&lt;/td&gt;
&lt;td&gt;✅ gemini-3-flash support&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Future Plans
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;More advanced routing algorithms&lt;/li&gt;
&lt;li&gt;Quick support for new Gemini models&lt;/li&gt;
&lt;li&gt;Performance optimization&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/goodaymmm/claude-gemini-multimodal-bridge" rel="noopener noreferrer"&gt;https://github.com/goodaymmm/claude-gemini-multimodal-bridge&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;README&lt;/strong&gt;: &lt;a href="https://github.com/goodaymmm/claude-gemini-multimodal-bridge/blob/main/README.md" rel="noopener noreferrer"&gt;https://github.com/goodaymmm/claude-gemini-multimodal-bridge/blob/main/README.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NPM&lt;/strong&gt;: &lt;a href="https://www.npmjs.com/package/claude-gemini-multimodal-bridge" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/claude-gemini-multimodal-bridge&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Feedback and Issues are welcome!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>gemini</category>
      <category>llm</category>
      <category>tooling</category>
    </item>
    <item>
      <title>14Forge: League of Legends Analytics Platform Built with n8n AI Agents and BrightData Web Scraping</title>
      <dc:creator>ryoto miyake</dc:creator>
      <pubDate>Mon, 01 Sep 2025 04:45:51 +0000</pubDate>
      <link>https://dev.to/ryoto_miyake/14forge-league-of-legends-analytics-platform-built-with-n8n-ai-agents-and-brightdata-web-scraping-573o</link>
      <guid>https://dev.to/ryoto_miyake/14forge-league-of-legends-analytics-platform-built-with-n8n-ai-agents-and-brightdata-web-scraping-573o</guid>
      <description>&lt;h2&gt;
  
  
  14Forge - LoL Performance Analytics Platform
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/brightdata-n8n-2025-08-13"&gt;AI Agents Challenge powered by n8n and Bright Data&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🏆 Contest Entry - BrightData + n8n Contest 2025
&lt;/h2&gt;

&lt;p&gt;A revolutionary League of Legends analytics platform featuring unique &lt;strong&gt;14-Minute Analysis™&lt;/strong&gt; technology, powered by n8n AI Agents and BrightData web scraping.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Every League of Legends player faces three critical challenges:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;"Why did I lose?"&lt;/strong&gt; - Statistics alone don't explain defeats&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gap with higher-rank players is invisible&lt;/strong&gt; - Benchmarks and improvement paths are unclear&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Meta information is fragmented&lt;/strong&gt; - Need to check multiple sites for comprehensive analysis&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;14Forge solves all of these problems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🎥 Live Demo Video
&lt;/h3&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/4DA_6hwkKf0"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  📸 Screenshots
&lt;/h3&gt;

&lt;h4&gt;
  
  
  AI Coaching Analysis
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjtt4i61abr3zqv0a6akv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjtt4i61abr3zqv0a6akv.png" alt="AI Coaching"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1rca7o51vzcm08hw5mfx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1rca7o51vzcm08hw5mfx.png" alt="AI Coaching"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  14-Minute Analysis Dashboard
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6g7dckrxjkno001q9ghj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6g7dckrxjkno001q9ghj.png" alt="14-Min Analysis"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F84rfpvrgzt35i0tv6ri6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F84rfpvrgzt35i0tv6ri6.png" alt="14-Min Analysis"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  n8n Workflows
&lt;/h3&gt;

&lt;p&gt;Workflows are publicly available at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/goodaymmm/14Forge" rel="noopener noreferrer"&gt;GitHub - 14Forge&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/goodaymmm/14Forge/tree/main/n8n_workflows" rel="noopener noreferrer"&gt;GitHub - n8n Workflows&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Main workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;14coacher.json&lt;/strong&gt; - Core AI coaching analysis workflow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;data-dragon-sync&lt;/strong&gt; - Champion and item data collection from Riot official API&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build-Blitz-Collector.json&lt;/strong&gt; - Meta data collection from Blitz.gg&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Match-Statistics-Collector.json&lt;/strong&gt; - Match statistics aggregation&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Technical Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  n8n AI Agent Workflow
&lt;/h3&gt;

&lt;p&gt;The core workflow consists of 5 main components:&lt;/p&gt;

&lt;h4&gt;
  
  
  Workflow Components
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Webhook Node&lt;/strong&gt; - Receives match data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BrightData Node&lt;/strong&gt; - Scrapes current tier and next tier meta data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI Agent Node&lt;/strong&gt; - Performance analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL Node&lt;/strong&gt; - Stores results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Response Node&lt;/strong&gt; - Returns coaching analysis&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  BrightData Node's Role
&lt;/h3&gt;

&lt;p&gt;The BrightData node plays a crucial role in enabling personalized meta data retrieval:&lt;/p&gt;

&lt;p&gt;The workflow generates URLs for both current-tier and next-tier data. For example, if a player is currently SILVER, it retrieves both SILVER and GOLD meta data. This lets the system offer "realistic next steps" rather than "copy the pros" advice, a fundamental difference in 14Forge's approach.&lt;/p&gt;

&lt;p&gt;The system generates two URLs dynamically for each player:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Current Tier Data&lt;/strong&gt;: Performance benchmarks from the player's current rank&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Next Tier Data&lt;/strong&gt;: Target benchmarks from one rank higher&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This tier-based approach ensures players receive achievable, incremental improvement goals rather than unrealistic pro-level targets.&lt;/p&gt;
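&lt;p&gt;The "one rank higher" pairing amounts to a simple ladder lookup. The sketch below is illustrative only: tier names follow the ranked ladder, and the real workflow builds its scraping URLs from a pair like this.&lt;/p&gt;

```typescript
// Illustrative sketch of the current-tier / next-tier pairing described above
const TIER_LADDER: string[] = [
  'IRON', 'BRONZE', 'SILVER', 'GOLD', 'PLATINUM', 'EMERALD', 'DIAMOND', 'MASTER',
];

function tierPair(current: string): { current: string; next: string } {
  const i = TIER_LADDER.indexOf(current);
  // Unknown or top tier: fall back to the current tier itself
  const next = i === -1 || i === TIER_LADDER.length - 1 ? current : TIER_LADDER[i + 1];
  return { current, next };
}

console.log(tierPair('SILVER')); // { current: 'SILVER', next: 'GOLD' }
```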

&lt;p&gt;This approach provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time personalization optimized for the player's rank, server, and patch immediately after match completion&lt;/li&gt;
&lt;li&gt;Presenting "realistic steps to the next rank" instead of "imitating pros"&lt;/li&gt;
&lt;li&gt;Completely different advice for SILVER→GOLD vs DIAMOND→MASTER even with the same champion&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additionally, BrightData collects matchup statistics against specific enemy champions and optimal item choices for those matchups, enabling deeper strategic advice. The system also dynamically adjusts recommendations based on the role played.&lt;/p&gt;


&lt;h2&gt;
  
  
  🛠️ Technical Challenges and Solutions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  AI Match Analysis: Core Innovation
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Context Building Optimization
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: Conveying complex match situations within LLM token limits&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Implemented a template-based rendering system that dynamically generates prompts based on player context, role, and language preferences.&lt;/p&gt;

&lt;p&gt;The system uses the &lt;code&gt;promptContextBuilder.ts&lt;/code&gt; service (backend/api/src/services/promptContextBuilder.ts) to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fetch current patch meta data dynamically from the database&lt;/li&gt;
&lt;li&gt;Build context with player performance metrics from 14-minute analysis&lt;/li&gt;
&lt;li&gt;Render templates with actual values using the &lt;code&gt;renderTemplate&lt;/code&gt; method that replaces {{placeholders}} with real data&lt;/li&gt;
&lt;li&gt;Support three languages (English, Japanese, Korean) by dynamically loading locale-specific prompt templates&lt;/li&gt;
&lt;/ul&gt;
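&lt;p&gt;The placeholder substitution step can be sketched as follows. This is a hypothetical signature; the actual &lt;code&gt;renderTemplate&lt;/code&gt; method in &lt;code&gt;promptContextBuilder.ts&lt;/code&gt; is not reproduced in this post.&lt;/p&gt;

```typescript
// Hypothetical sketch of {{placeholder}} substitution; the real renderTemplate
// method in promptContextBuilder.ts is not shown in the post.
function renderTemplate(template: string, context: { [key: string]: string }): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) =>
    key in context ? context[key] : match, // leave unknown placeholders as-is
  );
}

const rendered = renderTemplate(
  'Tier: {{tier}}, CS at 14 min: {{cs14}}, unknown: {{missing}}',
  { tier: 'SILVER', cs14: '112' },
);
console.log(rendered); // Tier: SILVER, CS at 14 min: 112, unknown: {{missing}}
```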

&lt;h3&gt;
  
  
  📈 Achievements
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Technical Implementation Success
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Caching Strategy&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PostgreSQL cache with a 6-hour TTL (a window in which the meta does not shift significantly)&lt;/li&gt;
&lt;li&gt;Immediate response for same matchId requests&lt;/li&gt;
&lt;li&gt;Cache check implementation within n8n workflow&lt;/li&gt;
&lt;/ul&gt;
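&lt;p&gt;The cache check amounts to a lookup keyed by &lt;code&gt;matchId&lt;/code&gt; with a 6-hour TTL. The sketch below is in-memory only and the names are illustrative; the real check runs against PostgreSQL inside the n8n workflow.&lt;/p&gt;

```typescript
// In-memory sketch of the matchId cache with a 6-hour TTL; the production
// system performs the equivalent check against PostgreSQL in the n8n workflow.
const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

interface CacheEntry {
  result: string;
  storedAt: number;
}

const cache: { [matchId: string]: CacheEntry } = {};

function putCache(matchId: string, result: string, now: number = Date.now()): void {
  cache[matchId] = { result, storedAt: now };
}

function getCached(matchId: string, now: number = Date.now()): string | null {
  const entry = cache[matchId];
  if (!entry) {
    return null;
  }
  if (now - entry.storedAt > SIX_HOURS_MS) {
    delete cache[matchId]; // stale: the meta may have shifted since
    return null;
  }
  return entry.result; // immediate response for a repeated matchId
}
```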

&lt;p&gt;&lt;strong&gt;Response Time&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Initial analysis: 2-3 minutes (BrightData scraping + AI analysis)&lt;/li&gt;
&lt;li&gt;Cached response: 0.5 seconds (PostgreSQL read only)&lt;/li&gt;
&lt;li&gt;Significant improvement in perceived performance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Multi-language Support&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Three language templates (English, Japanese, Korean)&lt;/li&gt;
&lt;li&gt;Each with culturally appropriate coaching approaches&lt;/li&gt;
&lt;li&gt;Japanese: Polite improvement suggestions&lt;/li&gt;
&lt;li&gt;Korean: Competitive and strategic focus&lt;/li&gt;
&lt;li&gt;English: Direct and metrics-focused&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Priority Actions System&lt;/strong&gt;:&lt;br&gt;
The system provides concrete improvement actions through &lt;code&gt;priority_actions&lt;/code&gt;. The n8n workflow (14coacher.json) generates role-specific and time-specific actions based on the player's performance. For example, for support players:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"【8-minute goal】Complete support item quest and switch to Oracle Lens"&lt;/li&gt;
&lt;li&gt;"【3:15-3:30】Level 3-4 assumed, trade with Q+W+E. R not available yet, maintain safe positioning"&lt;/li&gt;
&lt;li&gt;"Place vision ward at dragon pit entrance at 3:15"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These actions are dynamically generated based on the player's actual performance versus tier-appropriate benchmarks.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔮 Future Extensibility
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pro Scene Integration&lt;/strong&gt;: Translate pro play patterns into ranked play analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analysis Optimization&lt;/strong&gt;: The current prompt-based approach has limitations; we are considering fine-tuned sub-AI models for enhanced output&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;14Forge is not just another statistics tool. It's a platform that answers the "why" and "how" that players truly need. Through the powerful combination of n8n workflow automation and BrightData's reliable data collection, it surfaces insights of unprecedented depth.&lt;/p&gt;

&lt;p&gt;The unique 14-Minute Analysis™ focuses on the critical turning point in matches, while the tier-based progression system ensures every player receives personalized, achievable improvement goals rather than unrealistic pro-level targets.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>n8nbrightdatachallenge</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Shopping for Algolia Personalized: Privacy-First AI Shopping Assistant with MCP</title>
      <dc:creator>ryoto miyake</dc:creator>
      <pubDate>Sun, 27 Jul 2025 17:57:15 +0000</pubDate>
      <link>https://dev.to/ryoto_miyake/shopping-for-algolia-personalized-privacy-first-ai-shopping-assistant-with-mcp-d95</link>
      <guid>https://dev.to/ryoto_miyake/shopping-for-algolia-personalized-privacy-first-ai-shopping-assistant-with-mcp-d95</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/algolia-2025-07-09"&gt;Algolia MCP Server Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;As an avid online shopper frustrated by generic search results that didn't understand my preferences, I wanted to create a shopping experience that learns and adapts to each user individually. The challenge was clear: how do you combine powerful search capabilities with true personalization while keeping user data completely private?&lt;/p&gt;

&lt;p&gt;I built &lt;strong&gt;Shopping for Algolia Personalized&lt;/strong&gt;, an AI-powered desktop shopping assistant that combines the power of Algolia's search capabilities with advanced machine learning personalization. The application features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI Image Analysis&lt;/strong&gt;: Upload any product image and get instant search results using Google's Gemini 2.5 Flash API&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smart Personalization&lt;/strong&gt;: ML-driven recommendation engine that learns from your shopping behavior&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Desktop Integration&lt;/strong&gt;: 8 custom MCP tools that allow Claude to analyze your shopping patterns and provide personalized advice&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Index Search&lt;/strong&gt;: Seamless search across fashion, electronics, and other categories with intelligent fallback strategies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discovery Mode&lt;/strong&gt;: Mix personalized recommendations with inspiration items to discover new products - recreating that serendipitous moment in real shopping when something unexpected catches your eye&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The app is built with Electron + React + TypeScript and features a modern, dark-mode-enabled UI that makes shopping both efficient and enjoyable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Video Walkthrough
&lt;/h3&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/_AIdSyxy_T0"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  GitHub Repository
&lt;/h3&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/goodaymmm" rel="noopener noreferrer"&gt;
        goodaymmm
      &lt;/a&gt; / &lt;a href="https://github.com/goodaymmm/shopping-for-algolia-personalized" rel="noopener noreferrer"&gt;
        shopping-for-algolia-personalized
      &lt;/a&gt;
    &lt;/h2&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Shopping for Algolia Personalized&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;em&gt;Submission for the &lt;a href="https://dev.to/challenges/algolia-2025-07-09" rel="nofollow"&gt;Algolia MCP Server Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔗 Algolia MCP Server Application&lt;/strong&gt; - All search operations use the official &lt;a href="https://github.com/algolia/mcp-node" rel="noopener noreferrer"&gt;Algolia MCP Server&lt;/a&gt; via Model Context Protocol.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;AI-powered shopping assistant with image search, ML personalization, and Claude Desktop integration.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Demo&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;📹 &lt;strong&gt;Demo Video&lt;/strong&gt;: &lt;a href="https://youtu.be/_AIdSyxy_T0" rel="nofollow noopener noreferrer"&gt;Watch on YouTube&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Note: English is not the author's native language, so the English narration is AI-generated.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;AI image analysis with Gemini 2.5 Flash&lt;/li&gt;
&lt;li&gt;ML-powered personalization&lt;/li&gt;
&lt;li&gt;Claude Desktop MCP integration&lt;/li&gt;
&lt;li&gt;Multi-index Algolia search via MCP&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Features&lt;/h2&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;🔗 &lt;strong&gt;Algolia MCP Server&lt;/strong&gt;: All search operations via MCP protocol&lt;/li&gt;
&lt;li&gt;🤖 &lt;strong&gt;AI Image Search&lt;/strong&gt;: Gemini 2.5 Flash for product recognition&lt;/li&gt;
&lt;li&gt;🧠 &lt;strong&gt;ML Personalization&lt;/strong&gt;: Category and brand learning&lt;/li&gt;
&lt;li&gt;📊 &lt;strong&gt;8 MCP Tools&lt;/strong&gt;: Full suite for Claude Desktop integration&lt;/li&gt;
&lt;li&gt;🎨 &lt;strong&gt;Modern UI&lt;/strong&gt;: React + TypeScript with dark mode&lt;/li&gt;
&lt;li&gt;🔒 &lt;strong&gt;Secure Storage&lt;/strong&gt;: OS keychain for API credentials&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="markdown-heading"&gt;
&lt;h3 class="heading-element"&gt;AI-Powered Image Search&lt;/h3&gt;

&lt;/div&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/goodaymmm/shopping-for-algolia-personalized/SS/01Gemini_Analyze.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fgoodaymmm%2Fshopping-for-algolia-personalized%2FSS%2F01Gemini_Analyze.png" alt="Gemini Image Analysis"&gt;&lt;/a&gt;
&lt;em&gt;Upload any&lt;/em&gt;…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/goodaymmm/shopping-for-algolia-personalized" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  How I Utilized the Algolia MCP Server
&lt;/h2&gt;

&lt;p&gt;My implementation leverages the official Algolia MCP Server as the core search engine for an intelligent shopping experience:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. &lt;strong&gt;Hybrid Approach: Strategic Use of REST and MCP&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;I adopted a hybrid approach: REST API for initial setup only, then all operations via MCP.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Initial setup only: Create indices and upload data via REST&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;initializeIndices&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;indicesExist&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createIndicesViaREST&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uploadInitialData&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// All subsequent operations use MCP exclusively&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mcpClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AlgoliaMCPClient&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mcpClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Production: All searches via MCP&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;searchProducts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mcpClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;searchSingleIndex&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;indexName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;determineIndex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;hitsPerPage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why this approach&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Index creation and schema setup are more direct and reliable via REST API&lt;/li&gt;
&lt;li&gt;Once setup is complete, search operations can fully leverage MCP's persistent connection benefits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Results achieved&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Search latency improved from 200ms to 120ms average (40% reduction)&lt;/li&gt;
&lt;li&gt;Connection reuse minimized network overhead&lt;/li&gt;
&lt;li&gt;Error handling unified at protocol level, improving reliability&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. &lt;strong&gt;Innovative AI-Search Fusion&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Seamlessly integrating Gemini AI's image analysis with Algolia MCP's advanced search capabilities:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Transform image context into multi-dimensional search&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;searchStrategy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;primary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;extractedKeywords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;// "Adidas sneakers black"&lt;/span&gt;
  &lt;span class="na"&gt;fallback1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;withoutBrand&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="c1"&gt;// "sneakers black"&lt;/span&gt;
  &lt;span class="na"&gt;fallback2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;categoryExpansion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="c1"&gt;// "athletic shoes dark"&lt;/span&gt;
  &lt;span class="na"&gt;fallback3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;semanticSimilarity&lt;/span&gt;     &lt;span class="c1"&gt;// "sports footwear"&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. &lt;strong&gt;Strategic Dataset Design&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Three specialized indices to optimize for category-specific search behaviors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;indices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;fashion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;purpose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Fashion items only&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;brand&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;size&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;color&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;material&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;reasoning&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Optimized for size/color filtering common in fashion searches&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;electronics&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;purpose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Electronics only&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;brand&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;specs&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;compatibility&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;reasoning&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Optimized for technical specification filtering&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;other&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;purpose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Cross-category discovery&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1270&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;reasoning&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Mixed index to promote unexpected discoveries&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Design intent&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fashion (3,000 items)&lt;/strong&gt;: Fast filtering by size, color, material - fashion-specific attributes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Electronics (2,000 items)&lt;/strong&gt;: Detailed search by specs, compatibility, technical details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Other (1,270 items)&lt;/strong&gt;: Enable cross-category discoveries for serendipitous finds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Results achieved&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;90%+ relevance in category-specific searches&lt;/li&gt;
&lt;li&gt;Discovery Mode successfully mixes products from different indices for "real shopping" feel&lt;/li&gt;
&lt;li&gt;Each search completes within 50ms due to optimized index sizes&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. &lt;strong&gt;Personalization Layer&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Applying proprietary ML scoring on top of Algolia MCP search results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Learning weights from past user behavior applied to search results&lt;/li&gt;
&lt;li&gt;Discovery Mode intentionally mixes in unexpected results&lt;/li&gt;
&lt;li&gt;All operations complete within the MCP protocol&lt;/li&gt;
&lt;/ul&gt;
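&lt;p&gt;As a minimal sketch of that layer (the types, weight table, and default weight here are illustrative assumptions, not the app's actual code), the local re-ranking on top of MCP search results could look like this:&lt;/p&gt;

```typescript
// Sketch: re-rank search hits with locally learned category weights.
// The Hit shape and the default weight of 1.0 are assumptions for illustration.
type Weights = { [category: string]: number };

interface Hit {
  objectID: string;
  category: string;
  baseScore: number; // relevance score from the search layer
}

// Multiply each hit's score by its learned category weight, then re-sort.
function personalize(hits: Hit[], weights: Weights): Hit[] {
  return hits
    .map(hit => ({ ...hit, baseScore: hit.baseScore * (weights[hit.category] ?? 1.0) }))
    .sort((a, b) => b.baseScore - a.baseScore);
}
```

&lt;p&gt;Discovery Mode would then splice a few hits from outside the user's top categories back into the ranked list.&lt;/p&gt;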

&lt;p&gt;This design achieved both "efficient targeted search" and "joy of unexpected discovery" - showing new possibilities for e-commerce search experiences.&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Challenges Conquered
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🎯 Challenge 1: Making MCP Work on Windows Development
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: The official Algolia MCP Server only provided pre-built binaries for macOS/Linux. While the README showed a development approach using npm, making it work on Windows required custom adaptations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Discovered that the official README's Development section could be adapted for Windows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Leveraging the Development Approach&lt;/strong&gt;: The official README showed TypeScript execution for development
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;From&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;official&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;README&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Development&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;section&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"node"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"--experimental-strip-types"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"--no-warnings=ExperimentalWarning"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"src/app.ts"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Windows-Specific Implementation&lt;/strong&gt;: Adapted this for Windows within Electron
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Custom implementation for Windows compatibility&lt;/span&gt;
&lt;span class="nf"&gt;spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;node&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--experimental-strip-types&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--no-warnings=ExperimentalWarning&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;src/app.ts&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;start-server&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--credentials&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;applicationId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;algolia-mcp-source&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;ELECTRON_RUN_AS_NODE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="c1"&gt;// Critical for Electron&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Impact&lt;/strong&gt;: This approach became a blueprint for Windows developers wanting to use MCP technology, proving that official development documentation can be creatively adapted for unsupported platforms.&lt;/p&gt;

&lt;p&gt;In addition to this Windows implementation, I built a &lt;strong&gt;Custom Shopping AI MCP Server&lt;/strong&gt; with 8 specialized tools for e-commerce insights:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;get_personalization_summary&lt;/code&gt;: Overview of shopping preferences and ML confidence&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;get_user_preferences&lt;/code&gt;: Category and brand affinity analysis&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;get_saved_products&lt;/code&gt;: Full product database with price statistics&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;get_shopping_insights&lt;/code&gt;: Comprehensive spending analysis and recommendations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;get_product_comparisons&lt;/code&gt;: Category-based product comparisons&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;get_interaction_analytics&lt;/code&gt;: Click/save behavior metrics&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;suggest_products&lt;/code&gt;: Context-aware product recommendations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;search_products&lt;/code&gt;: Placeholder for future Algolia search integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This dual MCP architecture enables Claude Desktop to access both Algolia's search capabilities and deep shopping insights from the personalization engine.&lt;/p&gt;
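&lt;p&gt;Conceptually, the custom server routes tool calls by name. A hedged sketch of that dispatch (the handler bodies below are stubs for illustration, not the real implementations):&lt;/p&gt;

```typescript
// Sketch: dispatching the custom MCP tools by name.
// Handler signatures and return shapes are illustrative assumptions.
type ToolHandler = (args: object) => object;

const tools: { [name: string]: ToolHandler } = {
  get_personalization_summary: () => ({ categories: 3, confidence: 0.8 }),
  get_user_preferences: () => ({ topCategory: 'fashion' }),
  // ...the remaining six tools are registered the same way
};

function callTool(name: string, args: object): object {
  const handler = tools[name];
  if (!handler) throw new Error(`Unknown tool: ${name}`);
  return handler(args);
}
```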

&lt;h3&gt;
  
  
  🧹 Challenge 2: Cleaning Real-World E-commerce Data
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: 15% of products in the Amazon ESCI dataset had broken images, placeholder URLs, or missing metadata.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Built a robust filtering pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Detect and filter invalid product images&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isValidImage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;invalidPatterns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sr"&gt;/no&lt;/span&gt;&lt;span class="se"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;_-&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;image/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sr"&gt;/placeholder/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sr"&gt;/coming&lt;/span&gt;&lt;span class="se"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;_-&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;soon/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sr"&gt;/default&lt;/span&gt;&lt;span class="se"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;_-&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;product/i&lt;/span&gt;
  &lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;invalidPatterns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pattern&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Impact&lt;/strong&gt;: Improved search result quality by 40% and eliminated user frustration from broken images.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔍 Challenge 3: Turning Blurry Photos into Perfect Search Results
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: Users upload random product photos - blurry, partial, or with backgrounds - expecting accurate results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Implemented a multi-stage search fallback system:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Full AI-extracted keywords: "Adidas Ultraboost 22 black running shoes"&lt;/li&gt;
&lt;li&gt;Remove brand for broader results: "Ultraboost 22 black running shoes"&lt;/li&gt;
&lt;li&gt;Simplify to core terms: "black running shoes"&lt;/li&gt;
&lt;li&gt;Category expansion: "athletic footwear"&lt;/li&gt;
&lt;/ol&gt;
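&lt;p&gt;The staged fallback above boils down to trying progressively broader query variants until one returns hits. A simplified sketch (the names here are assumptions):&lt;/p&gt;

```typescript
// Sketch: try each query variant in order, broadest last.
type SearchFn = (query: string) => string[];

function searchWithFallback(variants: string[], search: SearchFn): string[] {
  for (const query of variants) {
    const hits = search(query);
    if (hits.length > 0) return hits; // stop at the first non-empty result
  }
  return []; // every variant came back empty
}
```

&lt;p&gt;In the real app, each variant would go through the MCP searchSingleIndex call rather than a plain function.&lt;/p&gt;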

&lt;p&gt;&lt;strong&gt;Impact&lt;/strong&gt;: Achieved 90%+ search accuracy even from ambiguous images.&lt;/p&gt;

&lt;h3&gt;
  
  
  ⚡ Challenge 4: The Feedback Loop Timing Dilemma
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: When should personalization kick in? Too fast = unstable results. Too slow = users don't see impact.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Hybrid learning approach with confidence levels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Immediate: Same products get boosted instantly&lt;/li&gt;
&lt;li&gt;Gradual: Category preferences emerge after 5+ interactions&lt;/li&gt;
&lt;li&gt;Stable: Full personalization at 10+ interactions with 0.8+ confidence&lt;/li&gt;
&lt;/ul&gt;
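&lt;p&gt;These levels can be expressed as a small staging function (a sketch; the thresholds come from the list above, while the names are assumptions):&lt;/p&gt;

```typescript
// Sketch: map interaction count and ML confidence to a learning stage.
// Thresholds (5+ interactions; 10+ with 0.8 confidence) are from the article.
type LearningStage = 'immediate' | 'gradual' | 'stable';

function learningStage(interactions: number, confidence: number): LearningStage {
  if (interactions >= 10) {
    if (confidence >= 0.8) return 'stable'; // full personalization
  }
  if (interactions >= 5) return 'gradual'; // category preferences emerge
  return 'immediate'; // only exact-product boosts apply
}
```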

&lt;p&gt;&lt;strong&gt;Impact&lt;/strong&gt;: Users feel heard immediately while receiving increasingly accurate long-term recommendations.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔐 Challenge 5: Privacy Without Compromise
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: How to provide deep personalization without sending user data to the cloud?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Local-first architecture with complete data sovereignty:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All ML processing happens locally with SQLite&lt;/li&gt;
&lt;li&gt;Only anonymous search queries touch the cloud&lt;/li&gt;
&lt;li&gt;Uninstall = complete data removal&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Impact&lt;/strong&gt;: Zero privacy concerns while maintaining full personalization capabilities.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons That Changed My Perspective
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Complexity of Real Data and Importance of Cleanup&lt;/strong&gt;: The Amazon ESCI dataset was chosen because it contains actual e-commerce search queries and product mappings. However, about 15% of products had invalid images like &lt;code&gt;no-image-available.jpg&lt;/code&gt;. Building regex and URL pattern matching to detect and filter these improved search quality dramatically.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Depth of User Behavior Weighting&lt;/strong&gt;: The personalization engine focuses on just two signals: 'save' actions (weight 1.0) and 'click' actions (weight 0.5). I initially tracked viewing time and hover events too, but they added too much noise. Focusing on actions with clear intent made the learning more accurate. In particular, applying time decay to repeated clicks on the same product let the scores properly reflect the user's current interests.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Feedback Loop Design&lt;/strong&gt;: The most challenging part of the AI image analysis → search → display → user action → personalization update loop was deciding when learning should take effect. Real-time updates made results unstable, while batch processing was too slow. I solved this with a graduated approach: immediately prioritize the same products, then gradually reflect category learning after 5+ interactions. This hybrid approach lets users see immediate impact while receiving increasingly accurate long-term recommendations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Value of Local-First Design&lt;/strong&gt;: One of the most important design decisions was implementing this as a desktop application (.exe) rather than a web service. Storing all personalization data locally on the user's machine had two major benefits:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Complete Privacy Protection&lt;/strong&gt;: User shopping behavior, interests, and search history never leave their machine. Data sovereignty remains entirely with the user - uninstalling removes all data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security Design Simplicity&lt;/strong&gt;: By avoiding web service requirements like authentication systems, session management, HTTPS, GDPR compliance, data encryption, and access controls, I could focus entirely on core functionality and user experience.&lt;/p&gt;

&lt;p&gt;Only the Algolia API and Gemini API connections use the cloud, and they are limited to search queries and image analysis without any personally identifiable information. By adhering to the principle that "user data belongs to the user," this design enabled both strong technical performance and an ethical guarantee to users.&lt;/p&gt;

&lt;ol start="5"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;New Possibilities for Desktop Applications&lt;/strong&gt;: Electron desktop apps enable features difficult to achieve on the web: high-speed local SQLite access, OS-level secure credential management (keytar), and local MCP server execution. The Claude Desktop integration was only possible because both run locally. While 'everything to the cloud' seems to be the trend, for privacy and performance-conscious applications, local-first remains the optimal choice.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Balancing Personalization with Discovery&lt;/strong&gt;: Just as in real shopping where unexpected items catch your eye while searching for something specific, Discovery Mode recreates this serendipity digitally. The most important lesson was not to sacrifice opportunities for new discoveries in pursuit of personalization accuracy.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
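&lt;p&gt;The intent-weighted scoring with time decay described in the behavior-weighting lesson above could be sketched like this (the save/click weights come from the article; the 30-day half-life and all names are assumptions):&lt;/p&gt;

```typescript
// Sketch: score a product's interactions with intent weights and time decay.
// save = 1.0 and click = 0.5 match the article; the half-life is assumed.
interface Interaction {
  type: 'save' | 'click';
  ageDays: number; // how long ago the interaction happened
}

const BASE_WEIGHT = { save: 1.0, click: 0.5 };
const HALF_LIFE_DAYS = 30;

function interestScore(interactions: Interaction[]): number {
  return interactions.reduce((sum, i) => {
    const decay = Math.pow(0.5, i.ageDays / HALF_LIFE_DAYS); // halves every 30 days
    return sum + BASE_WEIGHT[i.type] * decay;
  }, 0);
}
```

&lt;p&gt;With this shape, a fresh save outweighs a month-old click by a factor of eight, which matches the intent ordering described above.&lt;/p&gt;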

&lt;h3&gt;
  
  
  Technical Achievements
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Sub-second search responses with intelligent caching&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accuracy&lt;/strong&gt;: 90%+ relevance in image-based searches through Gemini AI integration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt;: Architecture supports easy addition of new product indices&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy&lt;/strong&gt;: All personalization data stored locally with SQLite&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Future-Ready&lt;/strong&gt;: While currently using a static dataset for the challenge, the architecture is designed for seamless transition to dynamic API data sources&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Extensibility: Ready for Real-Time Data
&lt;/h3&gt;

&lt;p&gt;Although this implementation uses a static dataset, the codebase is architected for easy migration to dynamic APIs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Product interface ready for real-time pricing and inventory&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;Product&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;price&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;           &lt;span class="c1"&gt;// Ready for real-time price updates&lt;/span&gt;
  &lt;span class="nx"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;categories&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;
  &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;brand&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;sourceIndex&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
  &lt;span class="c1"&gt;// Future fields for dynamic data&lt;/span&gt;
  &lt;span class="nx"&gt;inventory&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;      &lt;span class="c1"&gt;// Real-time stock levels&lt;/span&gt;
  &lt;span class="nx"&gt;originalPrice&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;  &lt;span class="c1"&gt;// For discount calculations&lt;/span&gt;
  &lt;span class="nx"&gt;lastUpdated&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;     &lt;span class="c1"&gt;// Track data freshness&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Service layer abstraction allows swapping data sources&lt;/span&gt;
&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nf"&gt;mapHitToProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;indexName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;Product&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;objectID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Unknown Product&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;price&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;price&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;hit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;salePrice&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Supports multiple price fields&lt;/span&gt;
    &lt;span class="c1"&gt;// ... mapping ready for any data source&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This design choice ensures that when real e-commerce APIs become available, the transition will require minimal code changes: just update the data source layer while keeping all ML, personalization, and UI components intact.&lt;/p&gt;
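&lt;p&gt;The data-source swap can be sketched as an interface with interchangeable implementations. This is a minimal illustration under assumed names (&lt;code&gt;ProductDataSource&lt;/code&gt;, &lt;code&gt;StaticDataSource&lt;/code&gt;, &lt;code&gt;ApiDataSource&lt;/code&gt; are all hypothetical, and the &lt;code&gt;Product&lt;/code&gt; shape is trimmed down), not the project's actual code.&lt;/p&gt;

```typescript
// Hypothetical abstraction: the app codes against ProductDataSource,
// so replacing the static dataset with a live API touches only this layer.
interface Product {
  id: string;
  name: string;
  price: number;
}

interface ProductDataSource {
  search(query: string): Promise<Product[]>;
}

// Current state: a static, in-memory dataset (stands in for the
// challenge's static index).
class StaticDataSource implements ProductDataSource {
  constructor(private products: Product[]) {}
  async search(query: string): Promise<Product[]> {
    const q = query.toLowerCase();
    return this.products.filter((p) => p.name.toLowerCase().includes(q));
  }
}

// Future state: the same contract backed by a live e-commerce API.
// The endpoint shape here is an assumption for illustration.
class ApiDataSource implements ProductDataSource {
  constructor(private baseUrl: string) {}
  async search(query: string): Promise<Product[]> {
    const res = await fetch(
      `${this.baseUrl}/search?q=${encodeURIComponent(query)}`
    );
    return (await res.json()) as Product[];
  }
}
```

&lt;p&gt;Because both classes satisfy the same interface, the ML, personalization, and UI layers depend only on &lt;code&gt;ProductDataSource&lt;/code&gt; and never need to know which implementation is behind it.&lt;/p&gt;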

&lt;p&gt;This project demonstrates how Algolia's powerful search capabilities, when combined with AI and personalization, can create a truly intelligent shopping assistant that adapts to each user's unique preferences while respecting their privacy.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>algoliachallenge</category>
      <category>ai</category>
    </item>
    <item>
      <title>I Built CGMB: An MCP That Unifies Claude Code, Gemini CLI, and Gemini API</title>
      <dc:creator>ryoto miyake</dc:creator>
      <pubDate>Mon, 07 Jul 2025 16:00:00 +0000</pubDate>
      <link>https://dev.to/ryoto_miyake/i-built-cgmb-an-mcp-that-unifies-claude-code-gemini-cli-and-gemini-api-3b0i</link>
      <guid>https://dev.to/ryoto_miyake/i-built-cgmb-an-mcp-that-unifies-claude-code-gemini-cli-and-gemini-api-3b0i</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;As English is not my first language, this post has been carefully translated and refined from the original Japanese version I wrote, which you can find on &lt;a href="https://qiita.com/goodaymmm/items/47554b8440278bea6df1" rel="noopener noreferrer"&gt;Qiita&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;When exploring tool development using Gemini CLI, I noticed that Google provides generous free usage quotas. By combining these with Claude Code, I thought we could create an &lt;strong&gt;MCP that complements Claude Code's missing capabilities (image generation, audio synthesis, etc.)&lt;/strong&gt;, expanding AI utilization possibilities while keeping costs low. This led to the development of CGMB.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://github.com/goodaymmm/claude-gemini-multimodal-bridge" rel="noopener noreferrer"&gt;&lt;strong&gt;CGMB (Claude-Gemini Multimodal Bridge)&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What is CGMB? — An MCP for Optimal Claude-Gemini Integration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🎯 Overview
&lt;/h3&gt;

&lt;p&gt;CGMB is equipped with intelligence that understands user intent and automatically routes tasks to the optimal AI layer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Automatic switching between 3 AI layers&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer Name&lt;/th&gt;
&lt;th&gt;Functionality&lt;/th&gt;
&lt;th&gt;Application Scenarios&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude Code&lt;/td&gt;
&lt;td&gt;Logic processing, long-text summarization, code analysis&lt;/td&gt;
&lt;td&gt;Advanced reasoning, long-form responses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini CLI&lt;/td&gt;
&lt;td&gt;Current information, URL analysis&lt;/td&gt;
&lt;td&gt;Latest news retrieval, web search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini API&lt;/td&gt;
&lt;td&gt;Image/audio/file generation&lt;/td&gt;
&lt;td&gt;Multimodal generation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Note: For convenience, Gemini API is defined as AI Studio in the code.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Feature List&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🧠 Auto Routing&lt;/td&gt;
&lt;td&gt;Analyzes PDF, URL, image instructions, etc., and automatically routes to the optimal AI.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🖼️ Multimodal&lt;/td&gt;
&lt;td&gt;Supports image generation from text and audio synthesis via Gemini API.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔄 Stabilization Technology&lt;/td&gt;
&lt;td&gt;Implements acceleration through credential caching and retry mechanisms for errors.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💬 Claude Code Integration&lt;/td&gt;
&lt;td&gt;Can be executed directly from within Claude Code using natural language prompts like &lt;code&gt;CGMB ...&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔐 Secure Authentication&lt;/td&gt;
&lt;td&gt;Supports secure API key management through &lt;code&gt;.env&lt;/code&gt; files and OAuth integration.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Unified AI Experience
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Complete within Claude Code:
"CGMB Search for the latest AI papers, summarize them, and generate related concept diagrams"

→ Automatic coordination of 3 AIs:
  1. Gemini CLI searches for latest papers
  2. Claude Code analyzes and summarizes content
  3. AI Studio (Gemini API) generates concept diagrams

Everything completed in one interface (Time required: 1/3 of traditional approach)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Installation and Initial Setup
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/goodaymmm/claude-gemini-multimodal-bridge" rel="noopener noreferrer"&gt;&lt;strong&gt;CGMB (Claude-Gemini Multimodal Bridge)&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Deep Dive: Behind the Scenes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Intelligent Routing
&lt;/h3&gt;

&lt;p&gt;The most complex implementation in CGMB was the request routing logic and its associated error handling.&lt;br&gt;
For natural language prompts from users (e.g., "Summarize /path/to/report.pdf"), CGMB internally performs the following decisions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Input Analysis&lt;/strong&gt;: Analyzes strings, file paths, and URLs contained in prompts using regular expressions and pattern matching.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resource Type Identification&lt;/strong&gt;: Identifies whether the input is a local PDF, a web URL, or plain text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimal Backend Selection&lt;/strong&gt;: Routes to the optimal backend service - PDFs to Claude, web pages to Gemini CLI, image generation instructions to Gemini API, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State Management and Fallback&lt;/strong&gt;: If a specified backend (e.g., Gemini CLI) is in an authentication error state, processing is halted and a clear error message is returned to the user. This prevents unexpected behavior or incomplete processing results.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By implementing this series of processes robustly, users can focus on their tasks without being aware of backend complexity.&lt;/p&gt;
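&lt;p&gt;The four decision steps above can be sketched roughly as follows. This is an illustrative reconstruction, not CGMB's actual source; the function names, regex patterns, and fallback behavior are assumptions.&lt;/p&gt;

```typescript
// Rough sketch of intent routing: classify the resource mentioned in a
// prompt, then pick a backend. Patterns and names are illustrative guesses.
type Backend = "claude" | "gemini-cli" | "gemini-api";

const URL_PATTERN = /https?:\/\/\S+/;
const PDF_PATTERN = /\S+\.pdf\b/i;
const IMAGE_INTENT = /\b(generate|draw|create)\b.*\b(image|diagram|illustration)\b/i;

function routePrompt(prompt: string): Backend {
  // Steps 1-2: input analysis and resource type identification.
  if (PDF_PATTERN.test(prompt)) return "claude";      // local PDFs -> Claude
  if (URL_PATTERN.test(prompt)) return "gemini-cli";  // web URLs -> Gemini CLI
  if (IMAGE_INTENT.test(prompt)) return "gemini-api"; // generation -> Gemini API
  // Step 3 default: plain text goes to Claude Code for reasoning.
  return "claude";
}

// Step 4: state management — refuse clearly instead of half-processing
// when the chosen backend is in an error state (e.g. failed auth).
function dispatch(prompt: string, unavailable: Set<Backend>): Backend {
  const backend = routePrompt(prompt);
  if (unavailable.has(backend)) {
    throw new Error(
      `Backend "${backend}" is unavailable (e.g. authentication error).`
    );
  }
  return backend;
}
```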
&lt;h3&gt;
  
  
  Choosing MCP Server
&lt;/h3&gt;

&lt;p&gt;By adopting the MCP protocol, CGMB achieved the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Future-proofing&lt;/strong&gt;: As the MCP ecosystem grows, integration with other tools is also envisioned&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standards compliance&lt;/strong&gt;: Adherence to industry standards rather than proprietary protocols&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By adopting MCP while the protocol is still being standardized, CGMB functions not as an isolated tool but as part of a growing ecosystem, with the potential to integrate with tools that emerge in the future.&lt;/p&gt;
&lt;h2&gt;
  
  
  Unifying Different AI Services
&lt;/h2&gt;

&lt;p&gt;The biggest challenge faced during development was &lt;strong&gt;unifying the response formats of different AI services&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Claude Code, Gemini CLI, and Gemini API each have different:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Response formats&lt;/strong&gt;: JSON, plain text, binary data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error handling&lt;/strong&gt;: HTTP status codes, exceptions, custom errors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Asynchronous processing&lt;/strong&gt;: Promises, callbacks, streaming&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To absorb these differences, we implemented a unified interface layer.&lt;/p&gt;

&lt;p&gt;Additionally, to improve image generation quality, we also implemented &lt;strong&gt;automatic English translation for multilingual prompts&lt;/strong&gt;. Since Gemini API's image generation achieves optimal results with English prompts, we designed a mechanism that automatically translates prompts input in Japanese or other languages to English internally. This allows users to make requests naturally in their native language while obtaining high-quality image generation results.&lt;/p&gt;
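&lt;p&gt;The multilingual-prompt handling can be illustrated with a simple detection step: if the prompt contains Japanese (or other CJK) characters, route it through a translation call before image generation. The detection heuristic and the &lt;code&gt;translate&lt;/code&gt; hook below are hypothetical sketches, not CGMB's actual code.&lt;/p&gt;

```typescript
// Heuristic check: does the prompt contain Hiragana, Katakana, or
// common CJK ideographs? If so, translate before image generation.
function needsTranslation(prompt: string): boolean {
  return /[\u3040-\u30ff\u4e00-\u9fff]/.test(prompt);
}

// Hypothetical hook: the real system would call a translation model here.
async function prepareImagePrompt(
  prompt: string,
  translate: (text: string) => Promise<string>
): Promise<string> {
  return needsTranslation(prompt) ? translate(prompt) : prompt;
}
```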

&lt;p&gt;By abstracting the functionality of each AI service to match the structured tool definitions required by the MCP protocol, users can now have a consistent experience without being aware of the implementation details of different AI services.&lt;/p&gt;
&lt;h2&gt;
  
  
  Summary: A New Paradigm for AI Integration
&lt;/h2&gt;

&lt;p&gt;CGMB presents a new approach that makes multiple AIs "collaborate" rather than "compete."&lt;/p&gt;
&lt;h3&gt;
  
  
  Insights Gained Through Development
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. The Importance of Right Tool for Right Job&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Rather than trying to solve everything with one AI, the combination that leverages each AI's strengths proved most efficient. By properly combining Claude Code's reasoning capabilities, Gemini CLI's retrieval of current information, and Gemini API's generation capabilities, we can create value that none of them can achieve individually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. UX-First Design Philosophy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;By making all functions accessible through the unified keyword &lt;code&gt;CGMB&lt;/code&gt;, users can focus on what they want to achieve without being aware of implementation details. This minimized learning costs while maximizing productivity.&lt;/p&gt;
&lt;h3&gt;
  
  
  Expected Use Cases
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Technical Blog Writing Support&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"CGMB Research the new features in Rust 1.70 and generate illustrated sample code"
→ Gemini CLI gathers latest information → Claude Code creates technical explanations → AI Studio (Gemini API) generates illustrations
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Automatic Presentation Material Generation&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"CGMB Generate images explaining our company's tech stack and create audio narration"
→ Claude Code creates structure → AI Studio (Gemini API) generates charts → Audio synthesis creates narration
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Multilingual Document Expansion&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"CGMB Analyze README.md and generate explanatory images for main features in both Japanese and English"
→ Claude Code analyzes documents → AI Studio (Gemini API) generates multilingual images
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Community Contribution
&lt;/h3&gt;

&lt;p&gt;CGMB is released under the MIT license and is available at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/goodaymmm/claude-gemini-multimodal-bridge" rel="noopener noreferrer"&gt;goodaymmm/claude-gemini-multimodal-bridge&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;NPM: &lt;a href="https://www.npmjs.com/package/claude-gemini-multimodal-bridge" rel="noopener noreferrer"&gt;claude-gemini-multimodal-bridge&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Future Expansion Plans
&lt;/h3&gt;

&lt;p&gt;We are planning the following feature expansions:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Planned Implementations&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OCR PDF Analysis&lt;/strong&gt;: Text extraction from scanned PDFs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Video Generation&lt;/strong&gt;: Video content generation using Gemini API&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;If reading this article made you think "I might want to try this out," that alone makes developing it worthwhile. If you find bugs, please let me know. Ideas like "It would be nice to have this feature" are also very welcome.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>gemini</category>
      <category>mcp</category>
    </item>
    <item>
      <title>"AI-Powered Development: Building a Java/Python LoRA Model Without Writing a Single Line of Code"</title>
      <dc:creator>ryoto miyake</dc:creator>
      <pubDate>Tue, 17 Jun 2025 16:00:00 +0000</pubDate>
      <link>https://dev.to/ryoto_miyake/ai-powered-development-building-a-javapython-lora-model-without-writing-a-single-line-of-code-1of8</link>
      <guid>https://dev.to/ryoto_miyake/ai-powered-development-building-a-javapython-lora-model-without-writing-a-single-line-of-code-1of8</guid>
      <description>&lt;p&gt;Hello everyone,&lt;/p&gt;

&lt;p&gt;I am an aspiring software engineer from Japan, currently transitioning from a non-technical background and seeking new opportunities.&lt;/p&gt;

&lt;p&gt;This article serves as my portfolio, documenting an experiment I conducted to answer a single question: &lt;strong&gt;"Can you build a fine-tuned LLM without writing a single line of code yourself?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As English is not my first language, this post has been carefully translated and refined from the original Japanese version, which you can find on Qiita &lt;a href="https://qiita.com/goodaymmm/items/daead2024a1d087177ca" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;My goal is to share the valuable lessons I learned about AI-driven development with a global audience. The complete source code is also available on &lt;a href="https://github.com/goodaymmm/LoRA_Java_Python" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;I'm currently enrolled in a vocational training program for Java and Python development in Japan.&lt;br&gt;
In this era of rapid AI evolution, I found myself asking: "How should engineers approach coding in the AI age?"&lt;br&gt;
To explore this question, I embarked on an experimental project: &lt;strong&gt;"Having AI write all the code for LoRA fine-tuning."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Through this challenge, I gained clarity on the skills required for AI-era engineers and what we should focus on learning.&lt;br&gt;
In this article, I'll share the insights and practical lessons learned from this journey.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔗 Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🎯 Why I Decided to Have AI Write LoRA&lt;/li&gt;
&lt;li&gt;🛠️ Tools Used&lt;/li&gt;
&lt;li&gt;🧩 Implementation Approach&lt;/li&gt;
&lt;li&gt;✍️ What I Had AI Write (Full Prompts Included)&lt;/li&gt;
&lt;li&gt;🐞 Challenges and AI's Mistakes&lt;/li&gt;
&lt;li&gt;🔬 Performance Evaluation&lt;/li&gt;
&lt;li&gt;🔁 Reflection&lt;/li&gt;
&lt;li&gt;🏁 Conclusion&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  1. Why I Decided to Have AI Write LoRA
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Catalyst
&lt;/h3&gt;

&lt;p&gt;I noticed that existing generative AI models struggle with specialized niche content (highly specialized fields, etc.), and realized that these gaps could be addressed efficiently by additional training on specialized data.&lt;/p&gt;

&lt;p&gt;Upon discovering that incorporating custom training data could solve these problems, I explored approaches to achieve this and adopted LoRA—&lt;strong&gt;"an efficient training method that specializes models on prepared data by training only additional parameters while maintaining pre-trained models."&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Have AI Write It?
&lt;/h3&gt;

&lt;p&gt;Writing LoRA code from scratch is extremely challenging for programming beginners, but we have a modern tool at our disposal: AI.&lt;br&gt;
This led me to the question: &lt;strong&gt;"Can this challenge be solved by having AI write everything?"&lt;/strong&gt; And so I put it into action.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Tools Used
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Development Environment

&lt;ul&gt;
&lt;li&gt;Cursor: Using Agent feature with Claude 3.7 Sonnet integration&lt;/li&gt;
&lt;li&gt;Claude Desktop: Used alongside for error handling and distributing token usage&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;I have monthly subscriptions to both Cursor and Claude Pro versions.&lt;/p&gt;

&lt;p&gt;For questions about errors and issues, I used Claude Desktop instead of Cursor's built-in Ask feature.&lt;br&gt;
*Note: During the later stages of development, Claude 4.0 was released, so issues encountered from that point were addressed using Claude 4.0.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why I Didn't Use Claude Code, Windsurf, or Devin

&lt;ul&gt;
&lt;li&gt;Token usage was unpredictable during the planning phase&lt;/li&gt;
&lt;li&gt;Claude Code wasn't available for Pro users at that time&lt;/li&gt;
&lt;li&gt;I had already subscribed to Cursor and Claude Pro plans&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  3. Implementation Approach
&lt;/h2&gt;

&lt;p&gt;First, I outlined the roadmap to implementation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[1] Prerequisites and Setup&lt;/li&gt;
&lt;li&gt;[2] Downloading the Base Model&lt;/li&gt;
&lt;li&gt;[3] Formatting Custom Datasets&lt;/li&gt;
&lt;li&gt;[4] Executing LoRA Fine-tuning&lt;/li&gt;
&lt;li&gt;[5] Implementing Inference Code&lt;/li&gt;
&lt;li&gt;[6] Deploying Model and Results&lt;/li&gt;
&lt;/ul&gt;



&lt;ul&gt;
&lt;li&gt;[1] Prerequisites and Setup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Since I was learning Java and Python, I decided to build an LLM specialized in these languages.&lt;br&gt;
The envisioned end product was a conversational AI similar to ChatGPT.&lt;/p&gt;

&lt;p&gt;I decided to use Docker to publish the final product on GitHub and enable others to reproduce the same environment.&lt;br&gt;
Additionally, since GPU usage was necessary, I defined CUDA utilization as a prerequisite.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[2] Downloading the Base Model&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I selected ELYZA-japanese-CodeLlama-7b.&lt;/p&gt;

&lt;p&gt;The selection criteria were: a 7B model was optimal for my PC specs, it has strong code generation capabilities, and it was pre-trained with a focus on the Japanese language.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Alternatives Considered:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mistral-7B: High versatility but inferior in code specialization&lt;/li&gt;
&lt;li&gt;Gemma-7B: Similarly more general-purpose&lt;/li&gt;
&lt;li&gt;Larger models: Excluded due to VRAM resource constraints&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;[3] Formatting Custom Datasets&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;I started by extracting data through web scraping from the following three sources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub&lt;/li&gt;
&lt;li&gt;&lt;del&gt;Qiita&lt;/del&gt; (later removed; see below)&lt;/li&gt;
&lt;li&gt;AtCoder from CodeContests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The selection criteria were as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub

&lt;ul&gt;
&lt;li&gt;Limited to highly-rated repositories with 1000+ stars&lt;/li&gt;
&lt;li&gt;High-quality code that has undergone code review&lt;/li&gt;
&lt;li&gt;Learning production-level code structures&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;del&gt;Qiita&lt;/del&gt;

&lt;ul&gt;
&lt;li&gt;&lt;del&gt;Abundant technical explanations in Japanese&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;&lt;del&gt;Enhanced learning effect through code-explanation pairs&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;&lt;del&gt;Strengthening Japanese code generation capabilities&lt;/del&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;AtCoder

&lt;ul&gt;
&lt;li&gt;Code patterns for algorithmic thinking&lt;/li&gt;
&lt;li&gt;Efficient code under constraints&lt;/li&gt;
&lt;li&gt;Practical solutions from competitive programming&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;These were my selection criteria.&lt;/p&gt;

&lt;p&gt;Qiita was initially a scraping target but was removed.&lt;br&gt;
The reason was dataset quality issues, which I'll detail later.&lt;/p&gt;

&lt;p&gt;After extraction, I converted these into JSON format for fine-tuning.&lt;/p&gt;

&lt;p&gt;[4] Executing LoRA Fine-tuning&lt;br&gt;
[5] Implementing Inference Code&lt;br&gt;
[6] Deploying Model and Results&lt;br&gt;
These three steps encountered issues, so I'll describe them in the later section.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. What I Had AI Write
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Everything!&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Here's the initial prompt I provided:&lt;/p&gt;

&lt;p&gt;Full Prompt&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Please build according to the following requirements.
Implementation is intended for Ubuntu under Docker environment, so please generate Dockerfile and Docker Compose as well.
Also, don't forget CUDA setup as we'll be using GPU for tuning.
Expected GPU specs: RTX 4070 Ti Super with 16GB VRAM.
Expected physical memory: 32GB.
Expected CPU: i7-14700K.

Download ELYZA-japanese-CodeLlama-7b (URL omitted in this article) to local environment and build a local CodeLLaMA by LoRA fine-tuning with Java/Python data.
Required datasets are as follows:
- GitHub
- Qiita
- AtCoder from CodeContests
These will be scraped.

Also, scraping intervals should be 1 second for GitHub and 2 seconds for Qiita.
Total scraping count should be 100 items each for both GitHub and Qiita.

For AtCoder data, please scrape from the following URL:
[URL omitted in this article]

API tokens for CodeLLaMA, GitHub, and Qiita will be input by the user later.
Since users will input environment variable settings, please include environment variables in the code.

Next, proceed to dataset formatting.
Please output code to format the scraped datasets into JSON.

After formatting, please output code for LoRA fine-tuning.
Then implement inference code.

Finally, document all these procedures including execution commands in a README.MD file.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Of course, this alone didn't work perfectly.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Challenges and AI's Mistakes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Docker Compose Issues
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Problems:

&lt;ul&gt;
&lt;li&gt;The generated Docker Compose didn't properly allocate CPU and memory&lt;/li&gt;
&lt;li&gt;AtCoder decompression was estimated to take about 26-27 hours&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Root Cause:

&lt;ul&gt;
&lt;li&gt;Despite providing host PC specs, appropriate Docker Compose wasn't generated&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Solution:

&lt;ul&gt;
&lt;li&gt;Added supplementary prompts considering WSL environment + memory/GPU/CPU allocation instructions&lt;/li&gt;
&lt;li&gt;Since I was using WSL→Docker connection, I also optimized WSLConfig (manual adjustment)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Here's the corrective prompt I provided:&lt;/p&gt;

&lt;p&gt;Docker Compose Reconfiguration Prompt&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;The current Docker Compose configuration doesn't fully utilize the host PC's specifications.
Host PC environment:
i7-14700K with 20 cores (8P+12E), 64GB memory, GPU is RTX 4070 Ti Super (16GB VRAM).
Please reconfigure to use 16 CPU cores, up to 48GB memory maximum, and allocate 10GB GPU memory.
Since the environment is built with Host PC→WSL→Docker connection, please leave about 4GB memory for WSL.
RTX 4070 Ti Super is a non-MIG compatible device, so please allocate 10GB memory using an alternative approach.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;After the fix, the build properly utilized the host PC's resources.&lt;br&gt;
In hindsight, I should have reviewed the generated configuration more carefully before running it.&lt;/p&gt;

&lt;h3&gt;
  
  
  API Errors During GitHub Scraping
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Problems:

&lt;ul&gt;
&lt;li&gt;Frequent API errors due to ambiguous scraping constraints&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Root Cause:

&lt;ul&gt;
&lt;li&gt;Undefined constraints that should have been determined&lt;/li&gt;
&lt;li&gt;Improper filtering settings&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Solution:

&lt;ul&gt;
&lt;li&gt;Clearly specified scraping targets&lt;/li&gt;
&lt;li&gt;Improved filtering and sorting to enhance scraping accuracy&lt;/li&gt;
&lt;li&gt;Defined appropriate resource allocation&lt;/li&gt;
&lt;li&gt;Rewrote with additional constraint instructions based on the above&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Constraints Prompt&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Please add the following constraints to the GitHub scraper:
- API interval should be 1 second
- Skip files with encoding: none
- Add file size limit (process only files under 1MB)
- Implement filtering to detect only .py and .java files
- Extract only repositories with 1000+ stars
- Set default maximum directory traversal depth to 2
- Since resources are ample, set parallel processing that won't hit API limits
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;These constraints helped eliminate waste and optimize processing.&lt;/p&gt;
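&lt;p&gt;The constraints in that prompt amount to a filter predicate plus a rate limit. A minimal sketch follows (in TypeScript for consistency with the rest of this feed; the actual generated scraper was Python, and all names and thresholds here simply mirror the prompt, not the generated code).&lt;/p&gt;

```typescript
// Sketch of the scraping constraints as data filters. Thresholds mirror
// the prompt: 1 MB size cap, .py/.java only, 1000+ stars, depth <= 2.
interface RepoFile {
  path: string;
  sizeBytes: number;
  encoding: string | null; // GitHub's content API reports "none" for large blobs
}

interface Repo {
  stars: number;
}

const MAX_FILE_BYTES = 1024 * 1024;
const MIN_STARS = 1000;
const MAX_DEPTH = 2;

function repoQualifies(repo: Repo): boolean {
  return repo.stars >= MIN_STARS;
}

function fileQualifies(file: RepoFile): boolean {
  const depth = file.path.split("/").length - 1; // directories above the file
  return (
    file.encoding !== null &&
    file.encoding !== "none" &&          // skip files with encoding: none
    file.sizeBytes < MAX_FILE_BYTES &&   // process only files under 1 MB
    /\.(py|java)$/.test(file.path) &&    // detect only .py and .java files
    depth <= MAX_DEPTH                   // cap directory traversal depth
  );
}

// The 1-second API interval, expressed as a simple delay helper.
const apiInterval = (ms = 1000) =>
  new Promise((resolve) => setTimeout(resolve, ms));
```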

&lt;h3&gt;
  
  
  Fine-tuning Errors
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Problem:

&lt;ul&gt;
&lt;li&gt;TensorBoard not installed error&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Root Cause:

&lt;ul&gt;
&lt;li&gt;TensorBoard wasn't installed&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Solution:

&lt;ul&gt;
&lt;li&gt;Resolved by installing TensorBoard&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;The AI-generated code was missing the necessary library installation definitions.&lt;br&gt;
It was an interesting discovery that AI can make human-like mistakes and isn't perfect.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multiple Errors During Inference
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Problems:

&lt;ul&gt;
&lt;li&gt;Failed when loading LoRA Adapter for inference&lt;/li&gt;
&lt;li&gt;First and second training attempts resulted in repeated "oreferrer" responses&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Root Cause:

&lt;ul&gt;
&lt;li&gt;Dataset cleaning wasn't optimized&lt;/li&gt;
&lt;li&gt;Base model produced good results, confirming the issue was on the LoRA side&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Solution:

&lt;ul&gt;
&lt;li&gt;Re-cleaned the dataset and re-executed LoRA training&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;The issue was resolved after the third training attempt.&lt;/p&gt;
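&lt;p&gt;Because degenerate outputs like the repeated "oreferrer" only showed up at inference time, a crude smoke test on generated text can catch them between training runs. A sketch (the thresholds are arbitrary assumptions):&lt;/p&gt;

```python
def looks_degenerate(text: str, min_tokens: int = 4, max_ratio: float = 0.5) -> bool:
    """Flag generations dominated by a single repeated token,
    like the 'oreferrer oreferrer ...' loops seen after bad training runs."""
    tokens = text.split()
    if len(tokens) < min_tokens:
        return False  # too short to judge
    most_common = max(set(tokens), key=tokens.count)
    return tokens.count(most_common) / len(tokens) > max_ratio
```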

&lt;h3&gt;
  
  
  Benchmark Errors
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Problems:

&lt;ul&gt;
&lt;li&gt;All questions failed during HumanEval benchmark testing with LoRA&lt;/li&gt;
&lt;li&gt;Simple tests adapted to training showed good results&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Root Cause:

&lt;ul&gt;
&lt;li&gt;LoRA trained on 3-space indented code while Python PEP 8 standard is 4 spaces&lt;/li&gt;
&lt;li&gt;Failed to properly incorporate Qiita articles into the dataset&lt;/li&gt;
&lt;li&gt;Determined the issue was with the trained model quality, not the evaluation method&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Solution:

&lt;ul&gt;
&lt;li&gt;Decided to rebuild from dataset formation, excluding Qiita&lt;/li&gt;
&lt;li&gt;Instructed code modifications for re-formation and re-training accordingly&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
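&lt;p&gt;The indentation mismatch could have been caught mechanically during dataset cleaning. A sketch that normalizes 3-space indentation to PEP 8's 4 spaces (assuming each sample is a plain Python source string):&lt;/p&gt;

```python
def reindent(source: str, old: int = 3, new: int = 4) -> str:
    """Rewrite leading indentation from `old`-space units to `new`-space units."""
    fixed = []
    for line in source.splitlines():
        stripped = line.lstrip(" ")
        levels = (len(line) - len(stripped)) // old  # indentation depth
        fixed.append(" " * (levels * new) + stripped)
    return "\n".join(fixed)
```

&lt;p&gt;In practice, samples whose indentation is not a clean multiple of three would need to be inspected or dropped rather than blindly rewritten.&lt;/p&gt;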

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; While I understood intellectually that &lt;strong&gt;"dataset quality determines model performance,"&lt;/strong&gt; this failure made me acutely aware of its importance. This was a valuable lesson that could only be gained through practice.&lt;/p&gt;

&lt;p&gt;After re-execution, the project was completed successfully. Let me proceed with the accuracy report.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Performance Evaluation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Benchmark
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Used HumanEval via &lt;a href="https://github.com/bigcode-project/bigcode-evaluation-harness" rel="noopener noreferrer"&gt;BigCode's evaluation harness&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Executed the standard set of 164 problems&lt;/li&gt;
&lt;li&gt;Compared performance across Code Llama 7B / ELYZA-japanese-Llama-7b / LoRA / GPT-4&lt;/li&gt;
&lt;/ul&gt;
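&lt;p&gt;For reference, the harness reports pass@k using the unbiased estimator from the original HumanEval paper; with a single greedy sample per problem (n = 1), pass@1 reduces to the fraction of problems solved:&lt;/p&gt;

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples per problem, c of them correct."""
    if n - c < k:
        return 1.0  # every size-k draw contains a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)
```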

&lt;h3&gt;
  
  
  Score Comparison
&lt;/h3&gt;

&lt;p&gt;Model comparison of pass@1 scores:&lt;/p&gt;

&lt;h3&gt;
  
  
  Vertical Bar Chart
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc5nk490drm06pfl33x95.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc5nk490drm06pfl33x95.png" alt="pass1-score-bar-chart.png" width="800" height="275"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Horizontal Line Chart
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6gd1unh6lzs0ktw79iwg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6gd1unh6lzs0ktw79iwg.png" alt="pass1-score-line-chart.png" width="800" height="275"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While GPT-4's score is in a class of its own, our tuning delivered a steady improvement over the base model.&lt;br&gt;
This demonstrates that specialization is effective even with a small-scale dataset,&lt;br&gt;
and I believe the results show potential to approach higher-tier models with larger datasets and further refinement.&lt;/p&gt;

&lt;p&gt;Next, the pie chart shows the degree of performance improvement from the base model through our tuning:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdctv64he5no1lk7wj8g0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdctv64he5no1lk7wj8g0.png" alt="lora-improvement-diagram.png" width="800" height="308"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(Note: Repeated benchmark executions showed score fluctuations of approximately ±3%.)&lt;/p&gt;
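&lt;p&gt;Given that fluctuation, averaging over several runs gives a fairer comparison; a minimal sketch (the scores are hypothetical):&lt;/p&gt;

```python
from statistics import mean, stdev

def summarize_runs(scores):
    """Mean and sample standard deviation of pass@1 across repeated runs."""
    return round(mean(scores), 4), round(stdev(scores), 4)

# Hypothetical pass@1 scores from three benchmark executions.
runs = [0.31, 0.29, 0.33]
```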




&lt;h2&gt;
  
  
  7. Reflection
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What Went Well
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Completed the Project&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The experience of setting a goal and seeing it through to completion was significant.&lt;br&gt;
I was able to build a proper roadmap to the outcome, and feel my approach was generally correct.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Written Entirely by AI&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I achieved my goal as planned.&lt;br&gt;
When given appropriate prompts, AI can implement even highly complex problems at a high level.&lt;br&gt;
Understanding architecture and being able to articulate it clearly remains essential for providing appropriate specifications.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Embracing New Knowledge&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While I had developed Java applications before, that work stayed largely within my existing knowledge.&lt;br&gt;
This project required starting from scratch in a new domain, but I felt no resistance to learning it.&lt;/p&gt;

&lt;p&gt;If generative AI alone can accomplish this, it reinforced a key insight for me: in this new era of creation, "small teams or individuals can rapidly develop valuable services with the right ideas." This realization was a major gain.&lt;/p&gt;

&lt;p&gt;Having previously self-taught video editing, I had relatively low resistance to new domains, but this experience further strengthened my ability to approach new challenges from scratch with a positive mindset.&lt;/p&gt;

&lt;h3&gt;
  
  
  Points for Improvement
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Underestimating the Scope&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The root cause was misjudging the scale of the development.&lt;br&gt;
This turned out to be a medium-scale project of about 2,000 lines. For larger projects I should have used Claude Code rather than Cursor Agent: it excels at parallel development, dividing work into phases and tickets with clear role separation, which makes debugging and error resolution easier.&lt;/p&gt;

&lt;p&gt;While the improvement might seem modest given the small training scale, discovering that even a small dataset can yield measurable gains was valuable.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Dataset Selection&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Using existing datasets made the pipeline relatively easy to implement.&lt;br&gt;
Experiencing firsthand how careless selection can muddy the data was extremely valuable.&lt;/p&gt;

&lt;p&gt;Considering actual operation, I feel I could have tried more niche subjects for dataset formation.&lt;br&gt;
Next time, I plan to tackle development with user experience in mind.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Understanding Architecture&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My understanding of LLM architecture was insufficient, which seemed to cause frequent errors especially around inference.&lt;br&gt;
With deeper understanding, I could have provided more detailed initial prompts and reduced the amount of rework needed.&lt;br&gt;
In the future, I want to prioritize time for understanding the overall technical landscape before implementation to improve development accuracy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenges
&lt;/h3&gt;

&lt;p&gt;My personal challenge is determining how much AI can be leveraged in actual workplace settings.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Main Challenges:

&lt;ul&gt;
&lt;li&gt;Is the configuration suitable for maintenance and operations?&lt;/li&gt;
&lt;li&gt;Is the code free from spaghetti code patterns?&lt;/li&gt;
&lt;li&gt;What about security design?&lt;/li&gt;
&lt;li&gt;AI might not perform well with proprietary frameworks&lt;/li&gt;
&lt;li&gt;Can it be applied when joining ongoing projects?&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;These challenges are difficult to see in individual development alone, and I'm eager to learn through practical work experience. My next challenge is understanding how to meet workplace requirements such as code maintainability in team development and security levels demanded by products.&lt;/p&gt;

&lt;h3&gt;
  
  
  Future Applications
&lt;/h3&gt;

&lt;p&gt;LLMs can be adapted to a wide range of fields.&lt;br&gt;
The success of LoRA fine-tuning depends on how well highly unique or specialized domain knowledge can be transformed into quality training data. Deepening expertise in structuring that knowledge and designing datasets therefore becomes a more important differentiator than the technical implementation itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  Next Steps
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Building on the insights and confidence gained from this project, I'm currently challenging myself with more applied development.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'm currently building an MCP client, applying professional software development practices to its creation.&lt;br&gt;
Specifically, it's an Android app themed around &lt;strong&gt;personalization&lt;/strong&gt;.&lt;br&gt;
I'm documenting this creation process as well and will publish it later.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Conclusion
&lt;/h2&gt;

&lt;p&gt;Through this project, I didn't directly acquire the skill of writing LoRA code from scratch by hand.&lt;br&gt;
However, I was able to question the essence of "writing code" and practice a new form of engineering that maximizes the use of AI as a tool to achieve objectives - and this was the greatest gain of all.&lt;/p&gt;

&lt;p&gt;It was also a valuable opportunity to test how far I could push my output capabilities when standing in the engineering arena, and I realized that "learning with AI" is an extremely meaningful approach for me.&lt;/p&gt;

&lt;p&gt;Even within the AI trend, LoRA fine-tuning is a topic that allows deep learning of architectural principles and approaches, and the modern environment is sufficiently equipped for beginners to get started.&lt;/p&gt;

&lt;p&gt;By touching upon the simple truth that &lt;strong&gt;"the quality of questions determines outcomes,"&lt;/strong&gt; I gained significant learning in developing &lt;strong&gt;"questioning skills"&lt;/strong&gt; - something often lacking in beginners.&lt;/p&gt;

&lt;p&gt;I hope this helps others who are similarly thinking about creating something with AI.&lt;br&gt;
And I myself want to continue challenging further applications and contributions based on this experience.&lt;/p&gt;

&lt;p&gt;Thank you for reading to the end.&lt;/p&gt;

&lt;p&gt;If you have any opinions or concerns, please feel free to let me know in the comments.&lt;br&gt;
I would like to incorporate them into future articles and activities.&lt;/p&gt;

&lt;h3&gt;
  
  
  Contact Information
&lt;/h3&gt;

&lt;p&gt;If you're interested in this project or in me personally, please feel free to contact me.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Email&lt;/strong&gt;: &lt;code&gt;ry.miyake.worker@gmail.com&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LinkedIn&lt;/strong&gt;: &lt;a href="https://www.linkedin.com/in/ryoto-miyake-954a3936a" rel="noopener noreferrer"&gt;ryoto-miyake-954a3936a&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>llm</category>
      <category>ai</category>
      <category>development</category>
      <category>coding</category>
    </item>
  </channel>
</rss>
