Automate Website Health: Building a High-Performance Dead Link Checker
As developers, we know that "Link Rot" is an inevitable part of the web. Sites move, pages are deleted, and external resources vanish. But manually clicking through every link on a site to find a 404 is a developer's nightmare.
I built Dead Link Checker, a professional desktop tool designed to turn this hours-long chore into a background task that finishes in minutes.
Why Another Link Checker?
There are many online tools, but they often come with limits: "100 pages for free," "No private sites," or "No PDF reports." This tool is built to be:
- Unlimited: It's open-source. Crawl 10 or 10,000 pages.
- Authenticated: Supports Basic Auth and Session Cookies for staging environments.
- Report-Driven: Generates PDF, CSV, and TXT reports for immediate use/distribution.
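To make the "Authenticated" point concrete, here is a minimal sketch of how a request can carry Basic Auth credentials and a session cookie using only the standard library. The helper name `build_request` and its parameters are hypothetical and chosen for illustration; the tool's own implementation may look different.

```python
import base64
import urllib.request


def build_request(url, username=None, password=None, cookies=None):
    """Return a HEAD request with optional Basic Auth and Cookie headers.

    Hypothetical helper for illustration, not the project's actual API.
    """
    request = urllib.request.Request(url, method="HEAD")
    if username is not None and password is not None:
        # Basic Auth is "username:password", base64-encoded
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    if cookies:
        # Session cookies travel as a single "name=value; name=value" header
        cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
        request.add_header("Cookie", cookie_header)
    return request
```

A request built this way can be passed straight to `urllib.request.urlopen`. Basic Auth credentials are only encoded, not encrypted, so it makes sense to use this against HTTPS staging sites.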
Technical Highlights: How it Works
The core of the application is built using Python 3, leveraging the power of multi-threading to handle I/O-bound network requests efficiently.
1. Multi-threaded Crawling
We use concurrent.futures.ThreadPoolExecutor to run a configurable pool of worker threads. Link checking is I/O-bound, so each worker spends most of its time waiting on the network. That lets the tool check many URLs at once without blocking the GUI's main loop.
# Snippet of the core logic
from concurrent.futures import ThreadPoolExecutor, as_completed

with ThreadPoolExecutor(max_workers=workers) as executor:
    # Map each future back to the URL it is checking
    futures = {executor.submit(check_url, url, timeout): url for url in url_list}
    for future in as_completed(futures):
        result = future.result()
        # Process results...
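The snippet above leaves `check_url` undefined, so here is a self-contained sketch of what such a function could look like with the standard library, plus a small wrapper around the pool. The return shape `(url, status)`, the `check_all` name, and the `checker` parameter are assumptions made for this example, not the project's actual code.

```python
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor, as_completed


def check_url(url, timeout):
    """Return (url, status): an HTTP status code, or an error string."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return url, response.status
    except urllib.error.HTTPError as exc:
        # The server answered with an error status, e.g. 404 for a dead link
        return url, exc.code
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # DNS failure, timeout, refused connection, or malformed URL
        return url, f"error: {exc}"


def check_all(url_list, workers=8, timeout=10, checker=check_url):
    """Check every URL concurrently and return {url: status}."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = {executor.submit(checker, url, timeout): url for url in url_list}
        for future in as_completed(futures):
            url, status = future.result()
            results[url] = status
    return results
```

Because `check_url` catches its own exceptions, one bad URL never aborts the whole crawl. Some servers reject HEAD requests, so a real checker would probably fall back to GET in that case.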
2. Modern GUI with CustomTkinter
Standard Tkinter looks like it's from the 90s. This app uses CustomTkinter for a modern, dark-mode-ready interface that feels premium and professional.
3. Smart Sitemap Support
Instead of a slow "brute-force" crawl, you can point it to a sitemap.xml. The tool parses the XML and audits exactly what the search engines see.
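For a rough idea of what the parsing step involves, here is a short sketch that pulls every `<loc>` entry out of a sitemap with `xml.etree.ElementTree`. The function name `extract_sitemap_urls` is hypothetical, and the sketch assumes the sitemap uses the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_sitemap_urls(xml_text):
    """Return the <loc> values from a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    urls = []
    for loc in root.iter(f"{SITEMAP_NS}loc"):
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

print(extract_sitemap_urls(sample))
```

The same function also works on a sitemap index file, where each `<loc>` points to another sitemap that would need to be fetched and parsed in turn.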
Use Cases for Developers
- Pre-deployment Audits: Run a crawl on your staging environment (localhost or a password-protected dev site) before you push to production.
- Client Handover: After finishing a project, generate a PDF Health Report to show the client that the site shipped with no broken links.
- Migration Verification: Moving from WordPress to Next.js? Use the CSV export to compare your old links against the new structure.
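As an illustration of the migration check, here is a small sketch that compares two CSV exports and lists the paths that exist on the old site but not on the new one. The column name `url` is an assumption about the export format, and both function names are hypothetical. Adjust them to match the headers in your actual report.

```python
import csv
import io
from urllib.parse import urlsplit


def paths_from_report(csv_text, url_column="url"):
    """Collect the URL paths listed in one CSV report.

    Assumes a header row with a column named `url_column` (hypothetical).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {urlsplit(row[url_column]).path or "/" for row in reader}


def missing_after_migration(old_csv_text, new_csv_text):
    """Return the paths the old site had and the new site lacks, sorted."""
    old_paths = paths_from_report(old_csv_text)
    new_paths = paths_from_report(new_csv_text)
    return sorted(old_paths - new_paths)
```

Comparing paths rather than full URLs lets the check work even when the domain changes during the migration. The comparison is exact, so `/about` and `/about/` count as different paths unless you normalize them first.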
How to Get Started
Quick Install (Windows)
Download the installer from our release page:
DeadLinkChecker_Setup_v2.0.4_x64.exe
Running from Source
git clone https://github.com/arif98741/deadlink-checker-python.git
cd deadlink-checker-python
pip install -r build_tools/requirements.txt
python src/deadlink_gui.py
Contributing
The project is structured with a clear separation between the core logic (deadlink_checker.py) and the UI (deadlink_gui.py).
I'd love to see the community add features like:
- Support for checking image assets (<img> tags)
- Headless CLI mode for CI/CD pipelines
- Integration with Slack/Discord webhooks
Check out the repo and let me know what you think!
Happy coding! Feel free to drop a star on GitHub if this helps your workflow! 🚀