Web applications often have directories and files that are not linked from the main pages. These paths can expose admin panels, backup files, logs, and config data. Automated content discovery tools like Gobuster use wordlists to test hundreds or thousands of paths quickly, and finding these before attackers do is a key part of web application security testing.
Using the Acme IT Support practice target on TryHackMe, you can see exactly how an attacker builds up knowledge of a target step by step, starting from a small fast scan and moving to deeper coverage with file extension checks.
Ethical Considerations
- Only scan systems you own or have written permission to test.
- Set clear scope limits before scanning, including target hosts, paths, time windows, and allowed methods.
- Start with safe scan settings to avoid breaking services.
- Handle found data with care. Do not take, share, or publish sensitive content from logs, backups, or archives.
- Remove IPs, tokens, credentials, usernames, and other sensitive details before sharing findings publicly.
- Report high-risk findings through proper disclosure channels.
- Follow all applicable laws, platform rules, and company policies.
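As one way to act on the redaction point above, here is a minimal sketch that masks IPv4 addresses and a few credential-style parameters before scan notes are shared. The function name `redact`, the parameter names it matches, and the sample line are all illustrative assumptions; a real report would likely need more patterns.

```shell
#!/bin/sh
# Sketch: mask sensitive values in scan output before sharing it.
# The patterns below are illustrative assumptions, not a complete list.

redact() {
    # Reads text on stdin, writes the redacted text to stdout.
    sed -E \
        -e 's/([0-9]{1,3}\.){3}[0-9]{1,3}/<redacted-ip>/g' \
        -e 's/(password|token|apikey)=[^&[:space:]]+/\1=<redacted>/g'
}

# Example: a hypothetical line from saved scan notes.
printf '%s\n' 'GET http://10.10.10.10/login?token=abc123 -> 200' | redact
```

Piping saved output through a filter like this before pasting it into a write-up keeps the habit cheap, though it is no substitute for reading the output yourself.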
Step 1: Run a Baseline Scan with a Small Wordlist
You can start with a small and fast wordlist to find common directories and files. This gives you quick results without waiting too long.
gobuster dir --url http://<target-ip>/ \
-w /usr/share/wordlists/SecLists/Discovery/Web-Content/common.txt
The dir mode brute-forces both directory and file paths. The --url flag (short form -u) sets the target. The -w flag points to the wordlist file. Gobuster uses 10 threads by default and treats 404 responses as negative results.
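To make the mechanics concrete, the following is a rough, single-threaded model of what a scan like this does with each wordlist entry. It is not how Gobuster is implemented; the function names `candidate_urls` and `probe` are hypothetical, and `probe` assumes curl is installed and that you are authorized to test the target.

```shell
#!/bin/sh
# Rough model of a dir scan: join each wordlist entry to the base URL,
# request it, and drop 404 responses. Not Gobuster's actual code.

candidate_urls() {
    # $1 = base URL; wordlist entries are read from stdin, one per line.
    base=${1%/}                      # drop one trailing slash, if present
    while IFS= read -r word; do
        [ -n "$word" ] || continue   # skip blank lines
        printf '%s/%s\n' "$base" "$word"
    done
}

probe() {
    # Print "<status> <url>" for each URL on stdin, skipping 404s.
    # Needs curl and network access to a target you may test.
    while IFS= read -r url; do
        code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
        [ "$code" = 404 ] || printf '%s %s\n' "$code" "$url"
    done
}

# Example (no network needed): build URLs for two entries.
printf '%s\n' assets robots.txt | candidate_urls 'http://<target-ip>/'
```

Chaining the two, as in `candidate_urls ... < wordlist | probe`, would approximate a very slow scan, which is a useful reminder of why a threaded tool is preferred.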
The scan found 9 paths:
| Path | Status | Notes |
|---|---|---|
| /assets | 301 | Redirect, static resources directory |
| /contact | 200 | Contact page |
| /customers | 302 | Redirect, possible user area |
| /development.log | 200 | Sensitive, exposed development log |
| /monthly | 200 | Monthly content endpoint |
| /news | 200 | News section |
| /private | 301 | Redirect, restricted area |
| /robots.txt | 200 | Crawler exclusion file, useful for recon |
| /sitemap.xml | 200 | Sitemap, reveals additional paths |
Finding /development.log at status 200 shows a high-risk misconfiguration. Development logs can contain stack traces, database queries, and sometimes credentials.
Remediation: Remove development logs from production servers. Use proper logging systems that store logs outside the web root. Add access controls if logs must be kept on the server.
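If the scan output is saved to a file (for example with Gobuster's -o option), a short filter can turn it into a status-sorted list for triage. This sketch assumes result lines of the form `/path (Status: NNN)` followed by optional bracketed fields; check that against the output of your Gobuster version. The function name `parse_hits` is hypothetical, and the sample lines reuse three paths from the table above.

```shell
#!/bin/sh
# Sketch: extract "<status> <path>" pairs from saved scan output.
# Assumes lines like "/assets (Status: 301) [Size: ...]".

parse_hits() {
    awk 'match($0, /\(Status: [0-9]+\)/) {
             # "(Status: " is 9 characters; the trailing ")" is 1 more.
             status = substr($0, RSTART + 9, RLENGTH - 10)
             print status, $1
         }' | sort -n
}

printf '%s\n' \
    '/private (Status: 301)' \
    '/customers (Status: 302)' \
    '/development.log (Status: 200)' | parse_hits
```

Sorting by status puts directly readable content such as /development.log at the top, which is usually where review should start.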
Step 2: Expand Coverage with a Larger Wordlist
You can use a bigger wordlist to find less common paths. Adding more threads and filtering noise makes the scan faster and the output cleaner.
gobuster dir -u http://<target-ip>/ \
-w /usr/share/wordlists/dirb/big.txt \
-t 50 -b 404,403 --no-error
The -u flag sets the target URL. The -w flag points to the larger dirb wordlist with over 20,000 entries. The -t 50 flag increases threads for faster scanning. The -b 404,403 flag hides not-found and forbidden responses. The --no-error flag removes error output for cleaner results.
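The effect of a status blacklist can be modeled in a few lines. This is only a sketch of the filtering idea, not Gobuster's code; `is_reported` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: decide whether a status code would be shown, given a
# comma-separated blacklist in the style of "-b 404,403".

is_reported() {
    # $1 = status code, $2 = comma-separated blacklist.
    # Returns 0 (true) when the code is NOT on the blacklist.
    case ",$2," in
        *",$1,"*) return 1 ;;
        *)        return 0 ;;
    esac
}

for code in 200 301 403 404; do
    if is_reported "$code" '404,403'; then
        echo "$code shown"
    else
        echo "$code hidden"
    fi
done
```

Wrapping both the list and the code in commas avoids partial matches, so a code of 40 would not be confused with 404.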
This scan found 2 new paths not seen in Step 1:
| Path | Status | Notes |
|---|---|---|
| /cookie-test | 200 | New, cookie testing endpoint exposed |
| /sitemap_xml | 200 | New, alternate sitemap path |
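Using -b 404,403 cuts noise, but a 403 still confirms that a path exists, so rerun without 403 in the blacklist when you want to map restricted areas. Setting threads to 50 speeds up the scan on stable targets; lower the value if the target starts returning errors or timeouts.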
Remediation: Remove internal test endpoints like /cookie-test from production. Use a single sitemap path and redirect alternates to avoid confusion.
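When scans are layered, it helps to list only the paths that a later scan added. A minimal sketch, assuming each scan's paths have been saved one per line; `new_paths` is a hypothetical helper, and the sample lists are abbreviated from the Step 1 and Step 2 results.

```shell
#!/bin/sh
# Sketch: print paths that appear in a later scan but not an earlier one.

new_paths() {
    # $1 = file with paths from the earlier scan, one per line
    # $2 = file with paths from the later scan, one per line
    # Note: assumes the earlier file is not empty.
    awk 'NR == FNR { seen[$0] = 1; next } !($0 in seen)' "$1" "$2"
}

old=$(mktemp) && new=$(mktemp) || exit 1
printf '%s\n' /assets /contact /robots.txt > "$old"
printf '%s\n' /assets /contact /cookie-test /robots.txt /sitemap_xml > "$new"
new_paths "$old" "$new"
rm -f "$old" "$new"
```

Unlike a sorted diff, this keeps the later scan's original order, which can make it easier to match lines back to the raw output.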
Step 3: Deep Scan with File Extension Checking
You can add file extension checking to find backup files, config files, and other sensitive file types. Each extension adds one more request per wordlist entry, so five extensions make the scan roughly six times larger in exchange for wider coverage.
gobuster dir -u http://<target-ip>/ \
-w /usr/share/wordlists/dirbuster/directory-list-2.3-medium.txt \
-x txt,json,bak,zip,md
The -x flag adds each extension to every wordlist entry. For example, backup becomes backup.txt, backup.json, backup.bak, and so on. This helps you find files that directory-only scans will miss.
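The expansion, and its cost, can be sketched as follows. This models the behavior described above rather than reproducing Gobuster's code; `expand_word` and `request_count` are hypothetical helper names, and the 1000-entry wordlist size is an arbitrary example.

```shell
#!/bin/sh
# Sketch: how one wordlist entry fans out when extensions are added.

expand_word() {
    # $1 = wordlist entry, $2 = comma-separated extensions (as for -x).
    # Prints the bare entry followed by one line per extension.
    printf '%s\n' "$1"
    printf '%s\n' "$2" | tr ',' '\n' | while IFS= read -r ext; do
        [ -z "$ext" ] || printf '%s.%s\n' "$1" "$ext"
    done
}

request_count() {
    # $1 = number of wordlist entries, $2 = number of extensions.
    echo $(( $1 * ($2 + 1) ))
}

expand_word backup 'txt,json,bak,zip,md'
request_count 1000 5
```

The second call shows the trade-off: a hypothetical 1000-entry list with five extensions produces 6000 requests, so extension lists are worth keeping short and targeted.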
This scan found 1 critical path:
| Path | Status | Notes |
|---|---|---|
| /tmp.zip | 200 | Critical, archive file exposed on web root |
Finding /tmp.zip shows why extension scanning matters: without -x, the scan requests only the bare wordlist entries and would likely miss it. Backup and temp files left in web-accessible paths are a common issue.
Remediation: Remove all backup and archive files from the web root. Use deployment scripts that clean up temp files. Store backups in secure locations outside the web server.
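On the defensive side, a deployment script could check the web root for leftover files before publishing. The sketch below is one possible approach, not a complete policy: `find_risky_files` is a hypothetical name, the extension list is an assumption you would tune, and the demo runs against a throwaway directory.

```shell
#!/bin/sh
# Sketch: list backup, archive, and log files under a directory.

find_risky_files() {
    # $1 = directory to inspect (e.g. a build output or web root).
    # Prints matching files; the extension list is an assumption.
    find "$1" -type f \( \
        -name '*.zip' -o -name '*.bak' -o -name '*.log' \
        -o -name '*.tar.gz' -o -name '*.sql' \) | sort
}

# Example against a throwaway directory.
demo=$(mktemp -d) || exit 1
mkdir -p "$demo/assets"
: > "$demo/index.html"
: > "$demo/tmp.zip"
: > "$demo/development.log"
: > "$demo/assets/site.css"
find_risky_files "$demo"
rm -rf "$demo"
```

A pipeline step could fail the deployment whenever this prints anything, which catches files like tmp.zip before a scanner does.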
Key Findings
| Path | Status | Risk |
|---|---|---|
| /development.log | 200 | High, may contain credentials or stack traces |
| /tmp.zip | 200 | High, archive with unknown contents exposed |
| /private | 301 | Medium, restricted area worth investigating |
| /customers | 302 | Medium, potential user data area |
| /cookie-test | 200 | Low-Medium, exposes internal test endpoint |
| /robots.txt | 200 | Informational, reveals disallowed paths |
| /sitemap.xml | 200 | Informational, additional path disclosure |
Summary
- Small wordlists are fast but miss many paths. You should layer scans with bigger wordlists to get better coverage.
- File extension scanning with -x is needed to find backup files like .bak and .zip, and config leaks.
- Filtering noise with -b to block status codes gives cleaner output for faster review.
- A 200 response means the content is accessible. 301 and 302 mean redirects worth following. Even 403 confirms a path exists.
- Exposed files like development.log and tmp.zip are real-world issues you will often see in penetration tests.
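The status code reading from the summary can be written down as a small lookup for use in review scripts. This is a sketch of the interpretation given in this post only; `describe_status` is a hypothetical name, and codes outside the list fall through to manual review.

```shell
#!/bin/sh
# Sketch: map a status code to the reading used in this walkthrough.

describe_status() {
    case "$1" in
        200)     echo 'accessible' ;;
        301|302) echo 'redirect, worth following' ;;
        403)     echo 'exists, access denied' ;;
        404)     echo 'not found' ;;
        *)       echo 'review manually' ;;
    esac
}

for code in 200 302 403; do
    printf '%s: %s\n' "$code" "$(describe_status "$code")"
done
```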
You can use these steps on your own assessments to find hidden content and improve your security posture. Always stay within scope and handle any sensitive data you find with care.


