DEV Community

Britto K

I Built "seoextract": A Python CLI Tool That Audits Website SEO from the Terminal

SEO tools are useful, but many of them are too heavy, too expensive, or too dependent on dashboards.

So I built seoextract, a simple Python CLI package that audits a website directly from the terminal and produces a clear SEO report with scores, grades, and actionable issues.

The goal was simple:

«Enter a website URL. Get an SEO audit. Understand what needs to be fixed.»


What is "seoextract"?

"seoextract" is a Python-based SEO audit tool that crawls a website and checks for common SEO problems such as:

  • Missing or weak title tags
  • Missing meta descriptions
  • Thin content
  • Missing canonical tags
  • Poor internal linking
  • Missing schema markup
  • Missing viewport meta tag
  • Basic page-level SEO quality

It then calculates a score and grade for the website.

Example output:

seoextract audit https://www.python.org --max-pages 1

Output:

SEO Audit Complete

URL: https://www.python.org
Pages crawled: 1
Site score: 75.0
Grade: B
Total issues: 3
Critical: 0
Warnings: 2
Info: 1
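One way a score and grade like these could be derived is to start from 100 and subtract a penalty per issue. The sketch below is only an illustration: the penalty weights, the grade bands, and the function names are assumptions, not seoextract's actual formula. The weights were picked so that 2 warnings and 1 info issue give 75.0, as in the sample above.

```python
from collections import Counter

# Hypothetical penalty per issue severity (not taken from seoextract itself).
PENALTIES = {"critical": 20.0, "warning": 10.0, "info": 5.0}

# Hypothetical grade bands as (minimum score, grade), highest first.
GRADE_BANDS = [(90.0, "A"), (75.0, "B"), (60.0, "C"), (45.0, "D")]


def score_page(severities):
    """Start from 100 and subtract a penalty per issue, never going below 0."""
    counts = Counter(severities)
    penalty = sum(PENALTIES[sev] * n for sev, n in counts.items())
    return max(0.0, 100.0 - penalty)


def grade_for(score):
    """Map a numeric score to a letter grade using GRADE_BANDS."""
    for minimum, grade in GRADE_BANDS:
        if score >= minimum:
            return grade
    return "F"


if __name__ == "__main__":
    issues = ["warning", "warning", "info"]
    score = score_page(issues)
    print(f"Site score: {score}")
    print(f"Grade: {grade_for(score)}")
```

Keeping the weights and bands in plain data structures makes the scoring rules easy to tune without touching the logic.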


Why I Built It

Most beginners learn SEO as a checklist:

  • Add a title
  • Add a meta description
  • Use headings properly
  • Add internal links
  • Add schema markup
  • Improve content length

But when building real websites, manually checking every page becomes boring and repetitive.

I wanted to build something that could automate the basic audit process.

At the same time, I wanted this project to help me improve my Python skills, especially in:

  • Web scraping
  • CLI development
  • Package structuring
  • SEO rule design
  • Publishing Python packages to PyPI

That is how "seoextract" started.


How It Works

The workflow is straightforward.

First, the user runs a command from the terminal:

seoextract audit https://example.com

Then "seoextract" performs these steps:

  1. Fetches the page HTML
  2. Parses the page content
  3. Extracts SEO-related elements
  4. Applies built-in SEO rules
  5. Calculates a score
  6. Displays a structured audit report
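Steps 2 and 3 can be sketched with nothing but the standard library. This is a minimal illustration, not seoextract's real code: the class name, the function name, and the returned keys are all assumptions, and fetching (step 1) is left out so the example runs offline on an HTML string.

```python
from html.parser import HTMLParser


class SEOExtractor(HTMLParser):
    """Collects a few SEO-related elements from an HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.canonical = None
        self.has_viewport = False
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name == "description":
                self.meta_description = attrs.get("content") or ""
            elif name == "viewport":
                self.has_viewport = True
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is collected here.
        if self._in_title:
            self.title += data


def extract_seo_elements(html):
    """Parse an HTML string and return the extracted elements as a dict."""
    parser = SEOExtractor()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "meta_description": parser.meta_description,
        "canonical": parser.canonical,
        "has_viewport": parser.has_viewport,
        "links": parser.links,
    }
```

A dict like this is a convenient hand-off point: the rule checks in step 4 only need the extracted values, not the raw HTML.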

The tool checks both technical SEO signals and content-level signals.

For example, if a page has no meta description, the tool reports it as an issue and gives a fix suggestion:

[WARNING] Missing Meta Description
fix: Add a meta description between 50–160 characters summarising the page.


Example Audit

Running:

seoextract audit https://example.com

May return something like:

Audit Summary

site_score : 59.0
grade : D
pages_crawled : 1
total_issues : 7
safe_browsing : True

Detected issues:

[WARNING] Title Too Short
fix: Title is 14 chars. Expand to at least 50 characters.

[WARNING] Missing Meta Description
fix: Add a meta description between 50–160 characters summarising the page.

[WARNING] Thin Content
fix: Page has only 21 words. Aim for at least 300 words of meaningful content.

[INFO] Missing Canonical Tag
fix: Add a canonical tag to prevent duplicate content issues.

[INFO] Poor Internal Linking
fix: Add at least 2 internal links to help search engines discover related pages.

[INFO] No Schema Markup
fix: Add Schema.org structured data to improve search result appearance.

This makes the report beginner-friendly because it does not just say what is wrong. It also tells you how to fix it.
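The checks in the sample report can be expressed as small rules that each pair a problem with a fix message. The sketch below is an assumption about how such rules might look, not the package's actual implementation. The thresholds and messages are copied from the sample output, while the function names and the `page` dict keys are hypothetical.

```python
def run_rules(page):
    """Return a list of (severity, name, fix) tuples for one page.

    `page` is a hypothetical dict with the keys: title, meta_description,
    word_count, canonical, internal_links, has_schema.
    """
    issues = []

    title = page.get("title") or ""
    if len(title) < 50:
        issues.append(("WARNING", "Title Too Short",
                       f"Title is {len(title)} chars. Expand to at least 50 characters."))

    if not page.get("meta_description"):
        issues.append(("WARNING", "Missing Meta Description",
                       "Add a meta description between 50–160 characters summarising the page."))

    words = page.get("word_count", 0)
    if words < 300:
        issues.append(("WARNING", "Thin Content",
                       f"Page has only {words} words. Aim for at least 300 words of meaningful content."))

    if not page.get("canonical"):
        issues.append(("INFO", "Missing Canonical Tag",
                       "Add a canonical tag to prevent duplicate content issues."))

    if page.get("internal_links", 0) < 2:
        issues.append(("INFO", "Poor Internal Linking",
                       "Add at least 2 internal links to help search engines discover related pages."))

    if not page.get("has_schema"):
        issues.append(("INFO", "No Schema Markup",
                       "Add Schema.org structured data to improve search result appearance."))

    return issues


def format_issue(issue):
    """Render one issue in the two-line style used by the report."""
    severity, name, fix = issue
    return f"[{severity}] {name}\nfix: {fix}"


if __name__ == "__main__":
    page = {"title": "Example Domain", "word_count": 21, "internal_links": 0}
    for issue in run_rules(page):
        print(format_issue(issue))
        print()
```

Because every rule returns its own fix text, adding a new check means adding one more `if` block rather than changing the report code.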


Features

The current version includes:

  • Website SEO auditing from the terminal
  • Page crawling with max-page control
  • SEO issue detection
  • Score and grade calculation
  • Human-readable fix suggestions
  • CLI interface
  • PyPI package support

The command format is simple:

seoextract audit <url> --max-pages <n>

Example:

seoextract audit https://www.python.org --max-pages 1
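A command shape like this maps naturally onto a subcommand parser. The sketch below uses the standard library's argparse as a stand-in. Which CLI library seoextract really uses is not stated here, and the default of 10 pages, the help texts, and the function names are assumptions for illustration.

```python
import argparse


def build_parser():
    """Build a parser for: seoextract audit <url> --max-pages <n>."""
    parser = argparse.ArgumentParser(prog="seoextract")
    subcommands = parser.add_subparsers(dest="command", required=True)

    audit = subcommands.add_parser("audit", help="Audit a website's SEO")
    audit.add_argument("url", help="Website URL to audit")
    audit.add_argument(
        "--max-pages",
        type=int,
        default=10,  # assumed default, not taken from the real tool
        help="Maximum number of pages to crawl",
    )
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # A real tool would start the crawl here; this sketch only echoes the input.
    print(f"Auditing {args.url} (up to {args.max_pages} pages)")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Accepting `argv` as a parameter keeps `main` easy to call from tests without touching `sys.argv`.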


What I Learned While Building It

This project taught me that even a simple CLI tool needs proper structure.

At first, web scraping looks like it can be done in just a few lines of Python using BeautifulSoup.

But a real tool needs more than that.

It needs:

  • Input validation
  • Error handling
  • HTML parsing
  • URL normalization
  • Rule-based checks
  • Scoring logic
  • Clean terminal output
  • Package configuration
  • CLI command registration
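Two of these items, input validation and URL normalization, can be sketched together with `urllib.parse`. This is a minimal example under stated assumptions: the function name is hypothetical, defaulting to https when the scheme is missing is a design choice made here, and credentials embedded in URLs are not handled specially.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(raw):
    """Validate a user-supplied URL and return a normalized form.

    Raises ValueError for input that is not an http(s) URL with a host.
    """
    raw = raw.strip()
    if "://" not in raw:
        # Assumption: treat bare domains such as "example.com" as https.
        raw = "https://" + raw

    parts = urlsplit(raw)
    scheme = parts.scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError(f"Unsupported URL scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError(f"URL has no host: {raw!r}")

    netloc = parts.netloc.lower()  # host names are case-insensitive
    path = parts.path or "/"       # an empty path and "/" name the same page
    # Drop the fragment: it is never sent to the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


if __name__ == "__main__":
    print(normalize_url("Example.COM"))
    print(normalize_url("HTTP://Example.com/Page?x=1#top"))
```

Normalizing early means the crawler can compare URLs as plain strings when deciding whether a page has already been visited.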

That is why a proper project structure matters.

A 5-line script can scrape a title.

But a package should be reliable, reusable, and understandable.


Why This Project Matters

"seoextract" is not trying to replace advanced SEO platforms.

Instead, it is useful for:

  • Beginners learning SEO
  • Developers checking their websites
  • Students building portfolio projects
  • Freelancers auditing small websites
  • Python learners practicing real-world CLI tools

It is a practical project because it combines programming with a real business use case.

SEO is not just a technical topic. It connects directly to traffic, visibility, leads, and marketing.

That makes this project more useful than a basic toy script.


Future Improvements

Future versions could include:

  • LLM-based SEO suggestions
  • Better scoring rules
  • Export to JSON, CSV, or PDF
  • More detailed page reports
  • Broken link detection
  • Keyword density analysis
  • Image alt text checking
  • Sitemap detection
  • Robots.txt analysis
  • Better multi-page crawling
  • AI-generated recommendations

One important improvement I am planning is to add optional LLM support so the tool can generate more detailed recommendations based on the page content.

For example, instead of only saying:

Missing meta description

It could suggest:

Suggested meta description:
Learn Python programming with tutorials, documentation, downloads, and community resources from Python.org.

That would make the tool much more useful for real users.


Final Thoughts

Building "seoextract" helped me understand how real developer tools are structured.

It is not just about writing scraping code.

It is about turning code into a usable product:

  • A CLI command
  • A package
  • A report system
  • A scoring engine
  • A tool that someone else can install and use

This project started as a simple SEO checker, but it became a strong learning experience in Python packaging, CLI development, and practical automation.

If you are learning Python, I highly recommend building small CLI tools like this.

They force you to think beyond code and start thinking like a product builder.
