This is a submission for the AI Agents Challenge powered by n8n and Bright Data
Building a Chat-with-GitHub Agent Using n8n and Bright Data
What I Built
I created an AI-powered GitHub Profile Summarizer using n8n and Bright Data. This agent takes a GitHub username as input, scrapes the user's public profile, and generates a concise HTML summary of their bio, top repositories, and contributions. It leverages Bright Data's web scraping capabilities and Mistral AI's language model to deliver a polished, human-readable output. The workflow handles invalid usernames gracefully, ensuring a robust user experience.
Demo
Watch the demo video to see the workflow generate a summary for a sample GitHub profile.
n8n Workflow
The workflow JSON is available in this GitHub Gist.
Technical Implementation
The agent is built using an n8n workflow with the following components:
- Webhook: Receives a GET request with a `username` query parameter.
- Set Username: Extracts and sets the GitHub username, defaulting to `abhirajadhikary06` if none is provided.
- Validate Username: Uses the regex `^[a-zA-Z0-9][a-zA-Z0-9-]{0,37}[a-zA-Z0-9]$` to ensure the username is valid (see the sketch after this list).
- Bright Data Scraper: Scrapes the GitHub profile using Bright Data's verified node.
- Mistral AI: Uses the `mistral-large-latest` model with a prompt to summarize the scraped data into a 200-word Markdown summary.
- Memory: Maintains a context window of 50 interactions for conversational continuity.
- AI Agent: Configured as a conversational agent with a prefix: "You are a helpful assistant summarizing GitHub profiles."
- Markdown to HTML: Converts the AI-generated Markdown summary to HTML.
- Chat Trigger: Supports chat-based input for interactive use cases.
- Error Handling: Returns a 400 status code with an error message for invalid usernames.
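To make the validation and error-handling steps concrete, here is a minimal sketch in plain TypeScript rather than the actual n8n node configuration. The regex, default username, and 400 response come from the workflow described above; the function name `handleUsername` and the response shape are illustrative assumptions.

```typescript
// Minimal sketch of the Set Username / Validate Username / Error Handling steps.
// Illustrative only -- the real logic lives in n8n nodes, not a standalone function.

const DEFAULT_USERNAME = "abhirajadhikary06";

// Same regex as the Validate Username node: 2-39 characters, alphanumeric
// with inner hyphens, no leading or trailing hyphen.
const GITHUB_USERNAME_RE = /^[a-zA-Z0-9][a-zA-Z0-9-]{0,37}[a-zA-Z0-9]$/;

interface WebhookReply {
  statusCode: number;
  body: { githubUsername?: string; error?: string };
}

function handleUsername(raw?: string): WebhookReply {
  // Fall back to the default username when the query parameter is missing.
  const githubUsername = (raw ?? "").trim() || DEFAULT_USERNAME;

  if (!GITHUB_USERNAME_RE.test(githubUsername)) {
    // Invalid usernames short-circuit with a 400, as in the Error Handling node.
    return { statusCode: 400, body: { error: `Invalid GitHub username: ${githubUsername}` } };
  }

  // Valid usernames flow on to the Bright Data scraper node.
  return { statusCode: 200, body: { githubUsername } };
}

console.log(handleUsername("torvalds"));   // { statusCode: 200, ... }
console.log(handleUsername("-bad-name-")); // { statusCode: 400, ... }
```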
Bright Data Verified Node
The Bright Data verified node is central to the workflow, scraping the GitHub profile page (`https://github.com/{{ $json.githubUsername }}`) using the dataset ID `gd_lyrexgxc24b3d4imjt`. It reliably extracts structured data (bio, repositories, contributions) without triggering GitHub's rate limits or CAPTCHAs, thanks to Bright Data's proxy management. The scraped data is passed to the AI Agent for summarization.
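For readers curious what the verified node does under the hood, here is a rough sketch of a comparable request against Bright Data's dataset trigger endpoint. The endpoint URL, Bearer-token auth, `BRIGHTDATA_API_TOKEN` environment variable, and `scrapeGithubProfile` function are assumptions for illustration; in the workflow itself the node handles the request, proxy rotation, and CAPTCHA avoidance.

```typescript
// Rough sketch of the scrape the Bright Data verified node performs.
// Assumes the dataset trigger endpoint and Bearer-token auth; treat the
// request shape as an approximation, not the node's exact behavior.

const BRIGHTDATA_API_TOKEN = process.env.BRIGHTDATA_API_TOKEN ?? "";
const DATASET_ID = "gd_lyrexgxc24b3d4imjt"; // same dataset ID as in the workflow

async function scrapeGithubProfile(githubUsername: string): Promise<unknown> {
  const url = `https://api.brightdata.com/datasets/v3/trigger?dataset_id=${DATASET_ID}`;

  const response = await fetch(url, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${BRIGHTDATA_API_TOKEN}`,
      "Content-Type": "application/json",
    },
    // One input row per profile URL, mirroring {{ $json.githubUsername }} in n8n.
    body: JSON.stringify([{ url: `https://github.com/${githubUsername}` }]),
  });

  if (!response.ok) {
    throw new Error(`Bright Data trigger failed: ${response.status}`);
  }

  // The trigger returns a snapshot reference; the structured profile data
  // (bio, repositories, contributions) is collected once the snapshot is ready.
  return response.json();
}
```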
Journey
Building this agent was a rewarding challenge. Integrating Bright Data's scraper required fine-tuning the dataset configuration to extract relevant profile data consistently. The Mistral AI model needed a precise prompt to produce concise summaries, which I iterated on to balance detail and brevity. Handling invalid usernames robustly was another hurdle, solved by tightening the regex validation. Learning to chain n8n nodes with AI and web scraping tools deepened my understanding of automation and data processing. The biggest lesson was the power of combining reliable data extraction (Bright Data) with intelligent processing (Mistral AI) in a seamless n8n workflow.