Evan Lin

Posted on Jan 11 • Originally published at evanlin.com on Jan 11

[n8n][Gemini] Building an AI-Powered RSS Summary System with Daily LINE Notifications

#gemini #automation #productivity #tutorial

Background

As an engineer with a bad case of information anxiety, I follow several tech blogs and Hacker News every day. Browsing them by hand takes too long, so I used n8n to build an automated pipeline: when an RSS feed updates, it fetches the full page content, generates a summary with Gemini, saves the result to Google Sheets, and pushes selected articles to LINE at 6 AM every day.

This project integrates multiple services:

📡 RSS Feed: Subscribe to multiple information sources
🕷️ Firecrawl: Fetch complete webpage content
🤖 Gemini 2.5 Flash: AI automatic summarization
📊 Google Sheets: Store article data
📱 LINE Messaging API: Flex Message push notifications

It sounds simple, but I ran into several pitfalls during implementation. This article records those problems and how I solved them.

System Architecture

The entire system is divided into two independent n8n Workflows:

Workflow 1: RSS Real-time Processing

RSS trigger → Format data → Firecrawl fetch webpage → Content preprocessing → Gemini summary → Write to Google Sheets
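As a rough sketch of the "Format data" step, the function below maps one RSS item onto the sheet columns used later in this article. The input field names (title, link, isoDate, pubDate) are assumptions about what the RSS node emits, and formatRssItem is a hypothetical helper, not code from the original workflow.

```javascript
// Hypothetical helper: map one RSS item to the Google Sheets columns.
// Assumes the RSS node emits title, link, and isoDate or pubDate.
function formatRssItem(rssItem, source) {
  const published = rssItem.isoDate || rssItem.pubDate || '';
  return {
    title: (rssItem.title || '').trim(),
    link: (rssItem.link || '').trim(),
    summary: '',              // filled in later by the Gemini step
    source: source || 'Unknown',
    created_at: published,
    sent: false,              // the daily workflow flips this after pushing
  };
}

// In an n8n Code Node this could be wired up roughly as:
//   return $input.all().map(i => ({ json: formatRssItem(i.json, 'HN') }));
const row = formatRssItem(
  { title: ' Example post ', link: 'https://example.com/a', isoDate: '2025-01-11T00:00:00.000Z' },
  'HN'
);
console.log(row);
```

Writing sent: false at this point means the daily workflow only has to look at one column to decide what to push.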

Workflow 2: Daily Scheduled Sending

Trigger at 6:00 AM daily → Read Google Sheets → Filter unsent → Take 10 items → Combine Flex Message → LINE push → Update status
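The "Filter unsent → Take 10 items" steps can be sketched as a single function. This is a minimal sketch, assuming the sent column may arrive either as a boolean or as the string TRUE/FALSE depending on how the sheet is read; selectUnsent is a hypothetical name.

```javascript
// Hypothetical helper: pick the first `limit` rows that have not been sent.
// Assumes `sent` may arrive as a boolean or as the string "TRUE"/"FALSE".
function selectUnsent(rows, limit = 10) {
  const isSent = (value) => String(value).trim().toUpperCase() === 'TRUE';
  return rows.filter((row) => !isSent(row.sent)).slice(0, limit);
}

const sampleRows = [
  { title: 'A', sent: 'TRUE' },
  { title: 'B', sent: 'FALSE' },
  { title: 'C', sent: false },
  { title: 'D', sent: '' },
];
const picked = selectUnsent(sampleRows, 2);
console.log(picked.map((r) => r.title)); // [ 'B', 'C' ]
```

Treating anything other than an explicit TRUE as unsent means a blank cell still gets delivered rather than silently skipped.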

Problems Encountered During Development

Problem 1: n8n Code Node Syntax Error

I initially used ES Module syntax in the Code Node:

// ❌ Incorrect approach
export default async function () {
  const items = this.getInputData();
  // ...
}

As a result, n8n kept reporting errors and failing to execute.

Solution: Use the Code Node's built-in syntax and read the input directly with $input.all():

// ✅ Correct approach
const items = $input.all();

const newItems = items.map(item => {
  // Processing logic
  return {
    json: {
      ...item.json,
      // Add fields
    }
  };
});

return newItems;

Problem 2: Gemini API Returns MAX_TOKENS Error

After sending the request, Gemini returned this result:

{
  "candidates": [
    {
      "content": { "role": "model" },
      "finishReason": "MAX_TOKENS",
      "index": 0
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 568,
    "totalTokenCount": 867,
    "thoughtsTokenCount": 299
  }
}

At first, I thought the input was too long, but promptTokenCount was only 568. The real problem was the output token limit.

Gemini 2.5 Flash has a thinking feature, and the tokens it spends on internal reasoning count against the output token limit. I had set maxOutputTokens: 300, but thinking used 299 of them, so almost nothing was left for the summary itself; the response above contains no text at all.

Solution: Increase maxOutputTokens or disable thinking:

// Solution 1: Increase output token limit
{
  "generationConfig": {
    "temperature": 0.7,
    "maxOutputTokens": 1024 // Increase from 300 to 1024
  }
}

// Solution 2: Disable Thinking function
{
  "generationConfig": {
    "temperature": 0.7,
    "maxOutputTokens": 512,
    "thinkingConfig": {
      "thinkingBudget": 0 // Disable thinking
    }
  }
}
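To catch this case automatically instead of spotting it by eye, a small check can run right after the Gemini call. The sketch below is hypothetical (inspectGeminiResponse is not part of the original workflow) and relies only on the fields visible in the response shown earlier in this section; candidatesTokenCount is treated as optional because that response does not include it.

```javascript
// Hypothetical helper: explain why a Gemini response has no usable text.
// Field names follow the response shown earlier; candidatesTokenCount is
// treated as optional because it is absent from that response.
function inspectGeminiResponse(response) {
  const candidate = (response.candidates || [])[0] || {};
  const parts = (candidate.content && candidate.content.parts) || [];
  const text = parts.map((p) => p.text || '').join('').trim();
  const usage = response.usageMetadata || {};
  const thoughts = usage.thoughtsTokenCount || 0;
  const visible = usage.candidatesTokenCount || 0;

  const truncated = candidate.finishReason === 'MAX_TOKENS';
  return {
    text,
    truncated,
    // Thinking is the likely culprit when it used more tokens than the answer.
    thinkingAteBudget: truncated && thoughts > visible,
  };
}

const sample = {
  candidates: [{ content: { role: 'model' }, finishReason: 'MAX_TOKENS', index: 0 }],
  usageMetadata: { promptTokenCount: 568, totalTokenCount: 867, thoughtsTokenCount: 299 },
};
console.log(inspectGeminiResponse(sample));
// { text: '', truncated: true, thinkingAteBudget: true }
```

When thinkingAteBudget is true, the workflow could skip writing an empty summary to the sheet and flag the row for a retry instead.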

Problem 3: Firecrawl Fetched Too Much Content

Firecrawl fetches the entire webpage, including navigation bars, sidebars, comment sections, and other noise. Sending it directly to Gemini would waste tokens and affect the quality of the summary.

Solution: Before sending it to Gemini, clean up the content with a Code Node:

const items = $input.all();
const maxLen = 1500; // Maximum length in characters (not words)

const newItems = items.map(item => {
  const title = item.json.title || '';
  const raw = item.json.content || '';

  // 1. Remove noise
  let text = raw
    .replace(/```[\s\S]*?```/g, '') // Remove code blocks
    .replace(/`[^`]+`/g, '') // Remove inline code
    .replace(/!\[[^\]]*\]\([^)]*\)/g, '') // Remove markdown images
    .replace(/<[^>]+>/g, '') // Remove HTML tags
    .replace(/\[([^\]]+)\]\([^)]+\)/g, '$1') // Keep link text (must run before URL removal)
    .replace(/https?:\/\/\S+/g, '') // Remove bare URLs
    .replace(/[#>*`|_~]/g, '') // Remove markdown symbols
    .replace(/\n{3,}/g, '\n\n') // Compress line breaks
    .replace(/\s{2,}/g, ' ') // Compress whitespace
    .trim();

  // 2. Cut off irrelevant content
  const cutPatterns = [
    'Leave a Reply', 'Recent Comments', 'Related Posts',
    'Share this', 'Subscribe', 'Newsletter', 'Copyright',
    '關於作者', '延伸閱讀', '相關文章', '留言'
  ];

  for (const pattern of cutPatterns) {
    const idx = text.indexOf(pattern);
    if (idx > 200) {
      text = text.slice(0, idx);
    }
  }

  // 3. Limit length, keep complete sentences
  text = text.slice(0, maxLen);
  if (text.length === maxLen) {
    const lastPeriod = Math.max(
      text.lastIndexOf('。'),
      text.lastIndexOf('！'),
      text.lastIndexOf('？'),
      text.lastIndexOf('. ')
    );
    if (lastPeriod > maxLen * 0.5) {
      text = text.slice(0, lastPeriod + 1);
    }
  }

  // 4. Compose a concise prompt
  const prompt = `Write a summary of less than 100 words in Traditional Chinese, only outputting the summary text:

Title: ${title}

Content:
${text}`;

  return {
    json: {
      ...item.json,
      prompt: prompt
    }
  };
});

return newItems;
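The prompt field produced by this node still has to be wrapped in a request body before the HTTP Request node can send it to Gemini. The following is a minimal sketch of that body; the endpoint URL in the comment and the buildGeminiRequest name are assumptions, since the article does not show the request itself.

```javascript
// Sketch of the request body for the HTTP Request node that calls Gemini.
// Assumed endpoint (not stated in the article):
// POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
function buildGeminiRequest(prompt, maxOutputTokens = 512) {
  return {
    contents: [{ role: 'user', parts: [{ text: prompt }] }],
    generationConfig: {
      temperature: 0.7,
      maxOutputTokens,
      thinkingConfig: { thinkingBudget: 0 }, // same setting as Solution 2 in Problem 2
    },
  };
}

const body = buildGeminiRequest('Write a summary ...\n\nTitle: Example\n\nContent:\nHello');
console.log(JSON.stringify(body, null, 2));
```

Building the body in a Code Node rather than in the HTTP Request node's JSON field avoids escaping problems when the cleaned text contains quotes or line breaks.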

Problem 4: LINE Flex Message Error "message is invalid"

LINE Push Message returns an error:

A message (messages[0]) in the request body is invalid

After checking the Flex Message JSON, I found that the title field of some articles was empty, resulting in "text": undefined. The LINE API does not accept empty text fields.

Root cause: The field name read from Google Sheets is not title, but col_1 (due to a problem with the header row settings).

Solution: Add fallback values when building the Flex Message:

const items = $input.first().json.data || [];

const bubbles = items.map((item) => {
  // Correction: Check for multiple possible field names and provide default values
  const title = item.title || item.col_1 || item.link || 'No Title';
  const summary = item.summary || 'No summary content';
  const link = item.link || 'https://example.com';
  const source = item.source || 'Unknown';

  return {
    "type": "bubble",
    "size": "kilo",
    "body": {
      "type": "box",
      "layout": "vertical",
      "contents": [
        {
          "type": "text",
          "text": title, // Ensure there is always a value
          "weight": "bold",
          "wrap": true
        },
        {
          "type": "text",
          "text": summary, // Ensure there is always a value
          "size": "sm",
          "wrap": true
        }
      ]
    },
    // ...
  };
});
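Two follow-up steps are implied here: checking that no text component is empty, and wrapping the bubbles in a carousel for the push request. The sketch below shows one way to do both. The function names and the user ID are hypothetical, and it assumes the standard LINE push endpoint (https://api.line.me/v2/bot/message/push) with a carousel capped at 12 bubbles.

```javascript
// Hypothetical check: list the paths of text components with missing/empty text.
function findEmptyTexts(node, path = 'bubble') {
  if (Array.isArray(node)) {
    return node.flatMap((child, i) => findEmptyTexts(child, `${path}[${i}]`));
  }
  if (node === null || typeof node !== 'object') return [];
  const own =
    node.type === 'text' && (typeof node.text !== 'string' || node.text.trim() === '')
      ? [path]
      : [];
  const nested = Object.entries(node).flatMap(([key, value]) =>
    findEmptyTexts(value, `${path}.${key}`)
  );
  return own.concat(nested);
}

// Hypothetical helper: wrap bubbles in a carousel and build the push body.
function buildPushBody(userId, bubbles) {
  if (bubbles.length === 0) throw new Error('No bubbles to send');
  return {
    to: userId,
    messages: [
      {
        type: 'flex',
        altText: `Daily digest: ${bubbles.length} articles`, // required, non-empty
        contents: { type: 'carousel', contents: bubbles.slice(0, 12) },
      },
    ],
  };
}

const badBubble = {
  type: 'bubble',
  body: {
    type: 'box',
    layout: 'vertical',
    contents: [
      { type: 'text', text: undefined },
      { type: 'text', text: 'ok' },
    ],
  },
};
console.log(findEmptyTexts(badBubble)); // [ 'bubble.body.contents[0]' ]
```

Running findEmptyTexts before the push turns LINE's generic "message is invalid" error into a concrete path to the offending component.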

API Credential Settings

Firecrawl API Key

Select Header Auth in n8n:

Field	Value
Name	`Authorization`
Value	`Bearer fc-your-api-key`

Gemini API Key

Select Header Auth in n8n:

Field	Value
Name	`x-goog-api-key`
Value	`your-gemini-api-key`

⚠️ Note: Gemini uses the x-goog-api-key header, not a Bearer token!

LINE Channel Access Token

Select Header Auth in n8n:

Field	Value
Name	`Authorization`
Value	`Bearer your-channel-access-token`
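The three credentials above differ only in header name and prefix, which is easy to mix up. As a quick reference, the hypothetical helper below encodes the same rules in code; it is only an illustration of the tables, not something n8n requires.

```javascript
// Hypothetical lookup of the header each service expects, per the tables above.
function buildAuthHeader(service, secret) {
  switch (service) {
    case 'firecrawl':
    case 'line':
      return { Authorization: `Bearer ${secret}` };
    case 'gemini':
      return { 'x-goog-api-key': secret }; // raw key, no "Bearer " prefix
    default:
      throw new Error(`Unknown service: ${service}`);
  }
}

console.log(buildAuthHeader('firecrawl', 'fc-your-api-key'));
console.log(buildAuthHeader('gemini', 'your-gemini-api-key'));
console.log(buildAuthHeader('line', 'your-channel-access-token'));
```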

Google Sheets Field Design

title	link	summary	source	created_at	sent
Article Title	URL	AI Summary	Source	Publication Time	FALSE

⚠️ Important: Make sure the header row is set correctly, otherwise the keys read by n8n will be in the format col_1, col_2!
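Fixing the header row is the real cure, but a defensive remap can keep the workflow running while the sheet is being corrected. The sketch below is hypothetical and assumes the columns appear in exactly the order shown in the table above.

```javascript
// Hypothetical fallback: rename col_1..col_6 keys to the intended field names.
// Assumes the column order shown in the field design table.
const COLUMNS = ['title', 'link', 'summary', 'source', 'created_at', 'sent'];

function normalizeRow(row) {
  const fixed = { ...row };
  COLUMNS.forEach((name, i) => {
    const generic = `col_${i + 1}`;
    if (fixed[name] === undefined && fixed[generic] !== undefined) {
      fixed[name] = fixed[generic];
      delete fixed[generic];
    }
  });
  return fixed;
}

console.log(normalizeRow({ col_1: 'Example title', col_2: 'https://example.com/a', col_6: 'FALSE' }));
// { title: 'Example title', link: 'https://example.com/a', sent: 'FALSE' }
```

Rows that already have proper keys pass through unchanged, so the remap is safe to leave in place after the sheet is fixed.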

LINE Flex Message Effect

The final Flex Message is in Carousel format, with one card per article:

┌──────────────────────────────────────────┐
│ 📝 DK                                    │ ← Source tag + emoji
├──────────────────────────────────────────┤
│ Article Title                            │ ← Bold title
│                                          │
│ Summary content summary content summary  │ ← 100-word summary
│ content summary content...               │
├──────────────────────────────────────────┤
│ [Read Original]                          │ ← Button link
└──────────────────────────────────────────┘

Different sources have different colors and emojis:

📝 DK (Blue #4A90A4)
🔥 HN (Orange #FF6600)
🎮 Steam (Dark Blue #1B2838)
🇯🇵 LY Blog (Green #00C300)
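The per-source styling can be kept in a single lookup table so that adding a feed means adding one line. This is a minimal sketch: the emojis and colors come from the list above, while the default style, the function names, and the header layout are assumptions for illustration.

```javascript
// Styles taken from the list above.
const SOURCE_STYLES = {
  DK: { emoji: '📝', color: '#4A90A4' },
  HN: { emoji: '🔥', color: '#FF6600' },
  Steam: { emoji: '🎮', color: '#1B2838' },
  'LY Blog': { emoji: '🇯🇵', color: '#00C300' },
};
// Assumed fallback for sources not in the list (the grey is illustrative).
const DEFAULT_STYLE = { emoji: '📰', color: '#888888' };

function styleFor(source) {
  return Object.prototype.hasOwnProperty.call(SOURCE_STYLES, source)
    ? SOURCE_STYLES[source]
    : DEFAULT_STYLE;
}

// Hypothetical header box for a bubble: colored background plus source tag.
function buildHeader(source) {
  const style = styleFor(source);
  return {
    type: 'box',
    layout: 'vertical',
    backgroundColor: style.color,
    contents: [
      { type: 'text', text: `${style.emoji} ${source || 'Unknown'}`, color: '#FFFFFF', weight: 'bold' },
    ],
  };
}

console.log(buildHeader('HN'));
console.log(buildHeader('Some new feed'));
```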

Pitfalls Summary

Problem	Cause	Solution
Code Node execution failed	ES Module syntax is not supported in the Code Node	Read input with `$input.all()` and return the items directly
Gemini MAX_TOKENS	Thinking tokens count against the output token limit	Increase maxOutputTokens to 1024 or set thinkingBudget to 0
Summary quality is poor	Too much webpage noise	Preprocess to remove irrelevant content
LINE message invalid	Flex Message has empty values	Add fallback default values
Google Sheets field name error	Header row not set correctly	Ensure the first row has the correct field names

Development Experience

This project taught me several important lessons:

Gemini 2.5's thinking feature consumes output tokens: If your output is truncated, check thoughtsTokenCount first. You may need to increase maxOutputTokens or disable thinking.
Write Code Node scripts the n8n way: Avoid export default and this.getInputData(). Reading input with $input.all() is the most reliable approach.
Always handle empty values: API responses may be missing fields, so add fallbacks wherever you assemble output.
Preprocessing matters: The cleaner the content you send to the model, the better the summary and the fewer tokens you spend.
Google Sheets field names come from the header row: If the keys come back as col_1, col_2, the header row is misconfigured.

This system now automatically pushes 10 selected articles to my LINE at 6 AM every day, so I can finally catch up on tech trends during my commute! 🎉
