<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alexey Leshchenko</title>
    <description>The latest articles on DEV Community by Alexey Leshchenko (@alexey_leshchenko_fc0ec66).</description>
    <link>https://dev.to/alexey_leshchenko_fc0ec66</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3619356%2F328d66a1-f1c6-4bc8-941d-59ae4d58619c.jpg</url>
      <title>DEV Community: Alexey Leshchenko</title>
      <link>https://dev.to/alexey_leshchenko_fc0ec66</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alexey_leshchenko_fc0ec66"/>
    <language>en</language>
    <item>
      <title>Free 17,500 LLM Requests a Day</title>
      <dc:creator>Alexey Leshchenko</dc:creator>
      <pubDate>Wed, 04 Feb 2026 12:33:18 +0000</pubDate>
      <link>https://dev.to/alexey_leshchenko_fc0ec66/free-17500-llm-requests-a-day-2an5</link>
      <guid>https://dev.to/alexey_leshchenko_fc0ec66/free-17500-llm-requests-a-day-2an5</guid>
      <description>&lt;h2&gt;
  
  
  The Problem: Rate Limits Kill Projects
&lt;/h2&gt;

&lt;p&gt;We’ve all been there. You’re building a bot or research tool, and just when it gets interesting, you hit a rate limit or your credits run out. Everything goes dark, and it's incredibly frustrating.&lt;/p&gt;

&lt;p&gt;The fix isn't finding one "perfect" free API. It’s about building a system that treats every provider as a disposable spare part. I built a Go-based gateway that handles 17,500+ requests a day for $0. Here’s how.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Backstory: Tired of Broken Bots
&lt;/h2&gt;

&lt;p&gt;I didn't actually want to write a Go service; I did it because I was sick of my antispam bot crashing.&lt;/p&gt;

&lt;p&gt;I started with Python and n8n, which worked for about five minutes. As traffic grew, the setup crumbled. Free models on OpenRouter changed weekly, and my bot would quit whenever an API vanished. I tried Cloudflare’s AI Gateway, but it disconnected under heavy load. To get 100% uptime on a budget, I had to build a tool I could actually control.&lt;/p&gt;

&lt;p&gt;The real hurdle was my hardware: a $3/month VDS with 700MB of RAM. Tools like LiteLLM used half my memory just idling. I needed a lightweight binary that could handle thousands of requests without breaking a sweat.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Plan: Building a "Meta-Tier"
&lt;/h2&gt;

&lt;p&gt;Instead of relying on one provider, I grouped several free APIs into a "Meta-Tier." If one provider throttles or goes offline, the gateway instantly moves to the next one.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz5i2tfzfbwvlrwlm2dmi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz5i2tfzfbwvlrwlm2dmi.png" alt="Aggregation of Free Tiers" width="800" height="358"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Capacity Breakdown:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Groq (Free)&lt;/strong&gt;: ~15,000 Req/Day (Llama 3.3 70B) — Industry-leading inference speed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini (AI Studio)&lt;/strong&gt;: 1,500 Req/Day (Gemini 1.5 Flash) — Massive context window.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenRouter&lt;/strong&gt;: 1,000 Req/Day (GPT-OSS / Qwen) — Access to niche/experimental models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mistral (Exp)&lt;/strong&gt;: Variable Capacity (Mistral Small) — Excellent for complex logic fallback.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Total: &lt;strong&gt;17,500+ requests/day for $0.00/month&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Gateway Works
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjdp1j5koln9gacyqx4p5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjdp1j5koln9gacyqx4p5.png" alt="How the Gateway Works - Request Flow" width="794" height="518"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is a specialized load balancer designed around LLM-specific failure modes. To keep the binary lean, it avoids heavyweight dependencies and sticks to a simple, robust request flow:&lt;/p&gt;

&lt;h3&gt;
  
  
  The Request Flow:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Client (Bot/App)&lt;/strong&gt; → Sends HTTPS request to Nginx.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nginx&lt;/strong&gt; → Proxies via Unix Socket to the Go Gateway.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Go Gateway&lt;/strong&gt; → Performs internal Auth &amp;amp; Token check.

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sequential Rotator&lt;/strong&gt; → Picks the first available provider (e.g., Groq).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failover Logic&lt;/strong&gt; → If Provider A returns a 429 (Rate Limit), the Gateway instantly retries with Provider B (Gemini) or Provider C (OpenRouter).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Logging&lt;/strong&gt; → Every success and failure is saved as structured JSON for monitoring.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
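The Nginx → Unix-socket hop in the flow above is standard reverse proxying. A minimal sketch of the relevant server block (the domain and socket path are assumptions, not taken from the repo):

```nginx
server {
    listen 443 ssl;
    server_name gateway.example.com;  # hypothetical domain

    location / {
        # Forward to the Go gateway listening on a local Unix socket;
        # no TCP port is exposed, which keeps the attack surface small.
        proxy_pass http://unix:/run/ai-gateway.sock;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

A Unix socket also skips the loopback TCP stack, which matters on a 700MB VDS.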

&lt;h2&gt;
  
  
  Why Go?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjvdr0v3juiky5tfkl20h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjvdr0v3juiky5tfkl20h.png" alt="Go is lightweight" width="542" height="506"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The 700MB RAM limit dictated the architecture. Python is too bloated for this hardware. This Go gateway is a small binary that sips ~15MB of RAM, leaving the rest of the server for your actual apps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Catching the Errors
&lt;/h2&gt;

&lt;p&gt;The "brain" is a Sequential Rotator that is "429-aware." When a provider returns a rate-limit error, the gateway catches it and retries with the next provider in milliseconds. Your application never sees the failure.&lt;/p&gt;

&lt;h2&gt;
  
  
  🚀 Get it Running
&lt;/h2&gt;

&lt;p&gt;First off, clone &lt;a href="https://github.com/leshchenko1979/ai-gateway" rel="noopener noreferrer"&gt;https://github.com/leshchenko1979/ai-gateway&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Setup
&lt;/h3&gt;

&lt;p&gt;Copy the example config and add your API keys.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cp config.yaml.example config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
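The post doesn't show the config format, so treat `config.yaml.example` in the repo as the source of truth. Purely for orientation, a provider-chain config for a gateway like this typically looks something like (all field names hypothetical):

```yaml
# Hypothetical sketch -- check config.yaml.example in the repo
# for the real schema; these keys are illustrative only.
providers:
  - name: groq
    api_key: YOUR_GROQ_KEY
  - name: gemini
    api_key: YOUR_GEMINI_KEY
  - name: openrouter
    api_key: YOUR_OPENROUTER_KEY
```

The order of entries is what the sequential rotator walks, so put your highest-capacity provider first.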



&lt;h3&gt;
  
  
  2. Install
&lt;/h3&gt;

&lt;p&gt;Skip Docker to save resources. Use the script to build and install the systemd service.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./install.sh build
./install.sh install-service
sudo systemctl start ai-gateway
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Remote Deploy
&lt;/h3&gt;

&lt;p&gt;Deploy from your local machine straight to your server.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cp .env.example .env
SSH_HOST=your-server.com ./install.sh deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Monitoring
&lt;/h2&gt;

&lt;p&gt;The gateway logs everything in JSON. Run&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;journalctl -u ai-gateway -f 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;to watch it swap providers in real time as rate limits are reached.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it Out
&lt;/h2&gt;

&lt;p&gt;Once running, the stack works like a single OpenAI-compatible endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer YOUR_INTERNAL_TOKEN" \
  -d '{"model": "gpt-oss-120b", "messages": [{"role": "user", "content": "Hello!"}]}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
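Because the endpoint speaks the OpenAI chat-completions format, the same call works from any HTTP client. A small Go sketch that builds the equivalent request (the URL, token, and model are the placeholders from the curl example above):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newChatRequest builds an OpenAI-style chat request against the
// gateway. baseURL and token come from your own deployment.
func newChatRequest(baseURL, token, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(map[string]any{
		"model": model,
		"messages": []map[string]string{
			{"role": "user", "content": prompt},
		},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest("POST", baseURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newChatRequest("http://localhost:8080", "YOUR_INTERNAL_TOKEN", "gpt-oss-120b", "Hello!")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```

Dispatch it with `http.DefaultClient.Do(req)`; the response body is a standard chat-completions JSON object, whichever upstream provider actually served it.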



&lt;p&gt;By owning this layer, you've built a private "meta-tier" that’s more reliable than any single API on its own.&lt;/p&gt;

&lt;p&gt;See the repo: &lt;a href="https://github.com/leshchenko1979/ai-gateway" rel="noopener noreferrer"&gt;https://github.com/leshchenko1979/ai-gateway&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>api</category>
      <category>go</category>
      <category>llm</category>
    </item>
    <item>
      <title>How I Built an MCP Server to Give AI Assistants Real Telegram Powers</title>
      <dc:creator>Alexey Leshchenko</dc:creator>
      <pubDate>Wed, 19 Nov 2025 17:15:17 +0000</pubDate>
      <link>https://dev.to/alexey_leshchenko_fc0ec66/how-i-built-an-mcp-server-to-give-ai-assistants-real-telegram-powers-28d</link>
      <guid>https://dev.to/alexey_leshchenko_fc0ec66/how-i-built-an-mcp-server-to-give-ai-assistants-real-telegram-powers-28d</guid>
      <description>&lt;p&gt;I've been working on AI integrations for a while, and one thing always bugged me: why can't AI assistants just... use Telegram like humans do? Search conversations, send messages, manage contacts - without all the complexity.&lt;/p&gt;

&lt;p&gt;Building &lt;strong&gt;fast-mcp-telegram&lt;/strong&gt; was my answer. It's a complete MCP server that lets AI assistants interact with Telegram naturally.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem I Needed to Solve
&lt;/h2&gt;

&lt;p&gt;Working with AI assistants, I kept running into limitations. They could analyze data, but actually using Telegram was clunky. Direct API calls required complex session management, bot frameworks were for user-facing chatbots, and search tools couldn't send messages or manage contacts.&lt;/p&gt;

&lt;p&gt;I needed something that gave AI assistants full Telegram capabilities in a natural way.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Makes This Different
&lt;/h2&gt;

&lt;p&gt;After trying various approaches, I built something specifically for AI assistants:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MCP Tools&lt;/strong&gt;: Direct Telegram access through the &lt;code&gt;invoke_mtproto&lt;/code&gt; tool and standard messaging/search functions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP Bridge&lt;/strong&gt;: For no-code tools like n8n and Make.com that can't use MCP directly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Setup&lt;/strong&gt;: Handles authentication and generates config files automatically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Production Support&lt;/strong&gt;: Bearer tokens, session isolation, and proper error handling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smart Search&lt;/strong&gt;: Multi-query support with deduplication and filtering for AI assistants&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full Messaging&lt;/strong&gt;: Send, edit, reply, share files, even message phone numbers not in contacts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File Handling&lt;/strong&gt;: Works with URLs or local files, handles security and albums&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Where This Fits In
&lt;/h2&gt;

&lt;p&gt;Other Telegram projects solve specific problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Search tools are great for finding messages, but can't send replies or manage contacts&lt;/li&gt;
&lt;li&gt;Bot frameworks work for user-facing chatbots, but not for AI assistants needing programmatic access&lt;/li&gt;
&lt;li&gt;Other MCP servers connect specific tools; this brings the entire Telegram ecosystem to AI assistants via direct MTProto API access&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Try the live demo&lt;/strong&gt;: &lt;a href="https://tg-mcp.redevest.ru/setup" rel="noopener noreferrer"&gt;https://tg-mcp.redevest.ru/setup&lt;/a&gt; - log in and download a config file, no installation needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For developers&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;fast-mcp-telegram
fast-mcp-telegram-setup &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--api-id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your_api_id"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--api-hash&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your_api_hash"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--phone-number&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"+123456789"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
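The setup flow generates the client config for you, so prefer the downloaded file. For orientation only, MCP clients such as Cursor or Claude Desktop register a server with a JSON entry of this general shape (the server name and command here are hypothetical, not taken from the project docs):

```json
{
  "mcpServers": {
    "telegram": {
      "command": "fast-mcp-telegram",
      "args": []
    }
  }
}
```

Whatever the exact entry, the client launches the server process and speaks MCP to it over stdio.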



&lt;p&gt;Full docs: &lt;a href="https://github.com/leshchenko1979/fast-mcp-telegram/#readme" rel="noopener noreferrer"&gt;https://github.com/leshchenko1979/fast-mcp-telegram/#readme&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Use It For
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Daily news summaries&lt;/strong&gt;: n8n automation searches my subscribed channels for the last 24 hours and sends AI-summarized digests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smart spam detection&lt;/strong&gt;: Uses HTTP-MTProto Bridge to get full user profiles (beyond regular Bot API) for better spam scoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content creation&lt;/strong&gt;: AI assistants analyze my previous posts' formatting to maintain consistent style&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customer service&lt;/strong&gt;: AI can respond to Telegram inquiries instead of just reading them&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Research workflows&lt;/strong&gt;: Search across channels and summarize findings&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;This started as a solution to my own AI-Telegram integration frustrations. The MCP protocol makes it natural for AI assistants to use, and the full feature set enables real applications beyond just demos.&lt;/p&gt;

&lt;p&gt;If this sounds useful, try the demo - setup takes 2 minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Demo&lt;/strong&gt;: &lt;a href="https://tg-mcp.redevest.ru/setup" rel="noopener noreferrer"&gt;https://tg-mcp.redevest.ru/setup&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/leshchenko1979/fast-mcp-telegram" rel="noopener noreferrer"&gt;https://github.com/leshchenko1979/fast-mcp-telegram&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Docs&lt;/strong&gt;: &lt;a href="https://github.com/leshchenko1979/fast-mcp-telegram/blob/master/docs/" rel="noopener noreferrer"&gt;https://github.com/leshchenko1979/fast-mcp-telegram/blob/master/docs/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This post is AI-assisted. But nowadays, everything is, right?&lt;/em&gt;&lt;/p&gt;

</description>
      <category>telegram</category>
      <category>mcp</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
