<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: gudhalarya</title>
    <description>The latest articles on DEV Community by gudhalarya (@gudhalarya).</description>
    <link>https://dev.to/gudhalarya</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3282587%2Fb5d7decc-d788-4e57-b8b2-0559939612bd.png</url>
      <title>DEV Community: gudhalarya</title>
      <link>https://dev.to/gudhalarya</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gudhalarya"/>
    <language>en</language>
    <item>
      <title>I Built a GitHub Repo Analyzer with FastAPI and PostgreSQL – Live Demo + Source Code</title>
      <dc:creator>gudhalarya</dc:creator>
      <pubDate>Fri, 27 Jun 2025 20:18:29 +0000</pubDate>
      <link>https://dev.to/gudhalarya/i-built-a-github-repo-analyzer-with-fastapi-and-postgresql-live-demo-source-code-54od</link>
      <guid>https://dev.to/gudhalarya/i-built-a-github-repo-analyzer-with-fastapi-and-postgresql-live-demo-source-code-54od</guid>
      <description>&lt;p&gt;Have you ever wanted a quick way to analyze any GitHub repository and see things like stars, forks, contributors, and top contributors — all in one place?&lt;/p&gt;

&lt;p&gt;I did too. So I built RepoVista — a fully open-source GitHub repo analyzer that gives you visual insights about any public repository.&lt;/p&gt;

&lt;p&gt;This was more than just a project — it helped me level up my skills in FastAPI, PostgreSQL, and Docker, and taught me how to turn an idea into a working, shareable product.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is RepoVista?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;RepoVista is a backend-heavy web app that lets you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Analyze any public GitHub repo&lt;/li&gt;
&lt;li&gt;Fetch stars, forks, and contributors&lt;/li&gt;
&lt;li&gt;Display a “Wall of Fame” with contributor avatars&lt;/li&gt;
&lt;li&gt;Deliver real-time data using async API calls&lt;/li&gt;
&lt;li&gt;Run everything in Docker with a production-ready structure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why I Built It&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I wanted to practice real-world API building, data processing, and backend-first thinking — and also create something that devs can actually use or extend.&lt;/p&gt;

&lt;p&gt;This wasn’t just for fun. I treated it like a product:&lt;br&gt;
CI/CD, error handling, database schema design, deployment — all included.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tech Stack Used&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Backend:&lt;/strong&gt; FastAPI, BeautifulSoup, GitHub API&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Database:&lt;/strong&gt; PostgreSQL&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DevOps:&lt;/strong&gt; Docker, GitHub Actions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Frontend:&lt;/strong&gt; HTML/CSS (basic; can be extended with React or Next.js)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Live Demo + Source Code&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;👉 Live Demo&lt;/strong&gt;: &lt;a href="https://repovista.vercel.app" rel="noopener noreferrer"&gt;https://repovista.vercel.app&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;👉 GitHub Repo&lt;/strong&gt;: &lt;a href="https://github.com/DRAKEN-1974/repovista" rel="noopener noreferrer"&gt;https://github.com/DRAKEN-1974/repovista&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How It Works (Simple Flow)&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The user inputs a GitHub repo such as vercel/next.js&lt;/li&gt;
&lt;li&gt;FastAPI fetches data via the GitHub API and scraping&lt;/li&gt;
&lt;li&gt;Data is structured and saved to PostgreSQL&lt;/li&gt;
&lt;li&gt;The frontend displays stats and contributors in a clean format&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Async logic keeps it fast, non-blocking, and scalable.&lt;/p&gt;
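&lt;p&gt;The flow above boils down to a single async endpoint. Here is a minimal, hypothetical sketch (not the actual RepoVista code; the route shape and field names are assumptions) using httpx against the public GitHub REST API:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Hypothetical sketch of the analyze step, assuming httpx and FastAPI
import httpx
from fastapi import FastAPI, HTTPException

app = FastAPI()

@app.get("/analyze/{owner}/{repo}")
async def analyze(owner: str, repo: str):
    # Non-blocking call to the public GitHub REST API
    async with httpx.AsyncClient() as client:
        resp = await client.get(f"https://api.github.com/repos/{owner}/{repo}")
    if resp.status_code != 200:
        raise HTTPException(status_code=404, detail="Repo not found")
    data = resp.json()
    # Shape the payload before persisting it to PostgreSQL
    return {
        "name": data["full_name"],
        "stars": data["stargazers_count"],
        "forks": data["forks_count"],
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;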

&lt;p&gt;&lt;strong&gt;What I Learned While Building It&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Async programming and API design with FastAPI&lt;/li&gt;
&lt;li&gt;Clean, maintainable database models with PostgreSQL&lt;/li&gt;
&lt;li&gt;How to responsibly scrape and use external APIs&lt;/li&gt;
&lt;li&gt;Writing Dockerfiles and creating CI pipelines with GitHub Actions&lt;/li&gt;
&lt;li&gt;Structuring a backend-first full-stack project for real deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What’s Next?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I’m working on improving it with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub OAuth (for private repo analysis)&lt;/li&gt;
&lt;li&gt;Time-series charts to show repo growth&lt;/li&gt;
&lt;li&gt;A Next.js frontend for a more dynamic UI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If anyone wants to collab or fork it — it's all open source and ready to roll.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Let’s Connect!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I’m actively looking for remote internships or freelance work, and I love building clean, useful tools.&lt;/p&gt;

&lt;p&gt;Feel free to reach out or connect here:&lt;br&gt;
GitHub: &lt;a href="https://github.com/DRAKEN-1974" rel="noopener noreferrer"&gt;https://github.com/DRAKEN-1974&lt;/a&gt;&lt;br&gt;
Email: &lt;a href="mailto:gudhalarya@gmail.com"&gt;gudhalarya@gmail.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thanks for reading!&lt;br&gt;
If you have ideas, suggestions, or questions, feel free to comment.&lt;br&gt;
And if you’re working on something cool, I’d love to see it too 🚀&lt;/p&gt;

</description>
      <category>fastapi</category>
      <category>webdev</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
    <item>
      <title>🕸️ How I Built a Modern Web Scraper with FastAPI &amp; Next.js</title>
      <dc:creator>gudhalarya</dc:creator>
      <pubDate>Sat, 21 Jun 2025 15:37:03 +0000</pubDate>
      <link>https://dev.to/gudhalarya/-how-i-built-a-modern-web-scraper-with-fastapi-nextjs-1c2p</link>
      <guid>https://dev.to/gudhalarya/-how-i-built-a-modern-web-scraper-with-fastapi-nextjs-1c2p</guid>
      <description>&lt;p&gt;Live demo: &lt;a href="https://web-scraper-zdoy.vercel.app/" rel="noopener noreferrer"&gt;https://web-scraper-zdoy.vercel.app/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Web scraping is one of the most powerful techniques for gathering data from the internet, whether you’re a developer, researcher, or data enthusiast. In this post, I’ll walk you through what web scraping is, why it’s useful, and how I built my own &lt;strong&gt;Modern Web Scraper&lt;/strong&gt; using &lt;strong&gt;FastAPI&lt;/strong&gt; (Python) for the backend and &lt;strong&gt;Next.js&lt;/strong&gt; (React/TypeScript) for the frontend. I'll also share my project structure, features, deployment approach, and tips for getting started!&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 What is Web Scraping?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Web scraping&lt;/strong&gt; is the process of automatically extracting information from websites. Instead of copying and pasting data manually, you can use code to fetch web pages and parse out the data you need. This is widely used for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Market price monitoring&lt;/li&gt;
&lt;li&gt;News aggregation&lt;/li&gt;
&lt;li&gt;Research and academic data collection&lt;/li&gt;
&lt;li&gt;SEO analysis (meta tags, headers, keywords)&lt;/li&gt;
&lt;li&gt;Competitive intelligence&lt;/li&gt;
&lt;li&gt;Archiving and more!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Always respect a website’s &lt;code&gt;robots.txt&lt;/code&gt; and Terms of Service. Scrape responsibly!&lt;/p&gt;




&lt;h2&gt;
  
  
  🛠️ Tech Stack
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Backend: Python + FastAPI&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;FastAPI:&lt;/strong&gt; Fast, modern web framework for building APIs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Requests:&lt;/strong&gt; For making HTTP requests to target websites&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BeautifulSoup:&lt;/strong&gt; For parsing and extracting content from HTML&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CORS Middleware:&lt;/strong&gt; To allow frontend-backend communication&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployed on &lt;a href="https://railway.app/" rel="noopener noreferrer"&gt;Railway&lt;/a&gt;:&lt;/strong&gt; Simple, free deployment for Python APIs&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Frontend: Next.js + React + TypeScript&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Next.js:&lt;/strong&gt; Framework for server-rendered React apps (easy deployment, SEO-friendly)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TypeScript:&lt;/strong&gt; Type safety for reliability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tailwind CSS:&lt;/strong&gt; Rapid UI styling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployed on &lt;a href="https://vercel.com/" rel="noopener noreferrer"&gt;Vercel&lt;/a&gt;:&lt;/strong&gt; The best way to host Next.js apps&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  📁 Project Structure
&lt;/h2&gt;

&lt;p&gt;My project is split into two main sections:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/backend         # FastAPI backend (main.py, requirements.txt)
/src/app         # Next.js frontend (page.tsx, layout.tsx, CSS)
/public          # Frontend static assets
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  ✨ Key Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scrape any public website&lt;/strong&gt; by entering its URL&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CSS Selector support:&lt;/strong&gt; Target specific elements (e.g. &lt;code&gt;h1&lt;/code&gt;, &lt;code&gt;.class&lt;/code&gt;, &lt;code&gt;#id&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extract all links or images&lt;/strong&gt; from a page&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Meta tag extraction:&lt;/strong&gt; View meta, Open Graph, Twitter, and canonical tags&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP headers viewer:&lt;/strong&gt; Inspect the response headers of any web page&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Export results&lt;/strong&gt; as TXT, CSV, or JSON&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configurable:&lt;/strong&gt; Set timeout, User-Agent, follow links (crawl depth), and more&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Modern UI:&lt;/strong&gt; Responsive, clean, and easy to use&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy-friendly:&lt;/strong&gt; No data is stored; all processing is local or via your backend&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ⚙️ How Does It Work?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Frontend:&lt;/strong&gt; You enter a URL (and optionally a CSS selector) in the web app.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Call:&lt;/strong&gt; The frontend sends your request to the FastAPI backend.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scraper:&lt;/strong&gt; The backend fetches the page using &lt;code&gt;requests&lt;/code&gt;, parses it with &lt;code&gt;BeautifulSoup&lt;/code&gt;, and extracts the desired content, links, images, or meta tags.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Results:&lt;/strong&gt; The data is sent back to the frontend for display, export, or further analysis.&lt;/li&gt;
&lt;/ol&gt;
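
&lt;p&gt;The scraper step (3) is essentially a fetch-and-parse loop. A minimal sketch, assuming the real backend adds more validation and error handling than shown here:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Minimal sketch of the fetch-and-parse core (requests + BeautifulSoup)
import requests
from bs4 import BeautifulSoup

def scrape(url: str, selector: str | None = None) -&gt; list[str]:
    resp = requests.get(url, timeout=10, headers={"User-Agent": "Mozilla/5.0"})
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    # With a CSS selector, return the matching elements' text;
    # otherwise return the whole page's text in one entry
    if selector:
        return [el.get_text(strip=True) for el in soup.select(selector)]
    return [soup.get_text(strip=True)]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;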




&lt;h2&gt;
  
  
  🚦 How to Use the Modern Web Scraper
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enter a URL:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Example: &lt;code&gt;https://example.com&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;(Optional) Add a CSS Selector:&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;h1&lt;/code&gt; for all h1 headings
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.product-title&lt;/code&gt; for elements with class "product-title"
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;#main&lt;/code&gt; for the element with ID "main"
&lt;/li&gt;
&lt;li&gt;Leave blank to get the entire HTML&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Tweak the Config (Optional):&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set request timeout (for slow websites)&lt;/li&gt;
&lt;li&gt;Change User-Agent (simulate different browsers)&lt;/li&gt;
&lt;li&gt;Enable "Follow Links" to crawl linked pages&lt;/li&gt;
&lt;li&gt;Enable "Include Metadata" to extract meta tags&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Scrape:&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click "Scrape" and see instant results in the UI&lt;/li&gt;
&lt;li&gt;Use "Meta Tags" and "Headers" buttons to inspect SEO and HTTP info&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Export or Copy Results:&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Download as TXT, CSV, or JSON&lt;/li&gt;
&lt;li&gt;Copy to clipboard with one click&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
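
&lt;p&gt;The selector options in step 2 map directly onto BeautifulSoup’s &lt;code&gt;select()&lt;/code&gt;. A small self-contained illustration (the sample HTML is made up for the demo):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# How each selector type behaves on a small HTML snippet
from bs4 import BeautifulSoup

html = """
&lt;h1&gt;Title&lt;/h1&gt;
&lt;p class="product-title"&gt;Widget&lt;/p&gt;
&lt;div id="main"&gt;Content&lt;/div&gt;
"""
soup = BeautifulSoup(html, "html.parser")

print([el.get_text() for el in soup.select("h1")])              # tag name
print([el.get_text() for el in soup.select(".product-title")])  # class
print([el.get_text() for el in soup.select("#main")])           # id
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;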




&lt;h2&gt;
  
  
  🚀 How to Deploy Your Own Version
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Backend (Python/FastAPI)&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Push your &lt;code&gt;/backend&lt;/code&gt; folder to GitHub&lt;/li&gt;
&lt;li&gt;Deploy on &lt;a href="https://railway.app/" rel="noopener noreferrer"&gt;Railway&lt;/a&gt; (or Render, Heroku, Fly.io)&lt;/li&gt;
&lt;li&gt;Use start command:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  uvicorn main:app --host 0.0.0.0 --port $PORT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Frontend (Next.js)&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Push your code to GitHub&lt;/li&gt;
&lt;li&gt;Deploy on &lt;a href="https://vercel.com/" rel="noopener noreferrer"&gt;Vercel&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Set your backend API URL in &lt;code&gt;.env.local&lt;/code&gt;:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  NEXT_PUBLIC_API_URL=https://your-backend.up.railway.app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  ⚠️ Limitations &amp;amp; Things to Know
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;JavaScript-heavy sites:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
This scraper fetches static HTML only. If a website loads content with JavaScript (like most React/Angular sites), the scraped data may be missing. For full JS-rendered scraping, consider using Playwright or Selenium.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bot Protection:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Some websites block scrapers using CAPTCHAs, rate limits, or IP bans. Always scrape ethically and responsibly.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
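
&lt;p&gt;For JavaScript-heavy sites, the Playwright route mentioned above looks roughly like this (a sketch, not part of this project; it assumes &lt;code&gt;pip install playwright&lt;/code&gt; followed by &lt;code&gt;playwright install&lt;/code&gt;):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Sketch: rendering a JS-heavy page with Playwright before parsing
from playwright.sync_api import sync_playwright

def scrape_rendered(url: str) -&gt; str:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Wait until network activity settles, i.e. after JS has run
        page.goto(url, wait_until="networkidle")
        html = page.content()
        browser.close()
    return html  # feed this HTML to BeautifulSoup as before
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;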




&lt;h2&gt;
  
  
  💡 Lessons Learned &amp;amp; Next Steps
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;FastAPI + Next.js = modern, scalable, and fun to build!&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Most scraping failures are due to JavaScript-heavy sites or anti-bot protections.&lt;/li&gt;
&lt;li&gt;Next steps: Add Playwright support for JavaScript rendering, user authentication, and Docker for even easier deployments.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  💬 Try It Yourself!
&lt;/h2&gt;

&lt;p&gt;Want to see it live or check out the code?&lt;br&gt;
👉 &lt;a href="https://github.com/DRAKEN-1974/web-scraper" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;  &lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Questions or feedback?&lt;/strong&gt;&lt;br&gt;
Drop a comment below or DM me on Twitter: &lt;a class="mentioned-user" href="https://dev.to/draken1974"&gt;@draken1974&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuu5qxjmm8stp8jz96ipp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuu5qxjmm8stp8jz96ipp.png" alt="Image description" width="800" height="449"&gt;&lt;/a&gt;&lt;br&gt;
Happy scraping! 🕷️&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
