Week 4 Scripting Exercise: Build a Web Reconnaissance Report Generator
Time: 2-4 hours
Type: Free-response scripting project
Skills: HTTP Protocol, Python requests/parsing, Security Headers, Cookie Analysis
🎯 The Challenge
From Grace Nolan's Security Engineering Interview Notes:
"Web scrapers - Write a script to scrape information from a website."
Your task: Build a Python script that performs passive reconnaissance on a target URL and generates a security-focused report.
This is the discovery phase of security assessments - understanding what you're looking at before testing anything.
⭐ This exercise is part of my open-source Security Engineering curriculum.
If you find this helpful, star the repo on GitHub to support the project and get notified when new exercises drop!
Background Reading
Before you start coding, review these concepts:
From Grace Nolan's Notes (Networking Section)
HTTP Response Headers contain:
- Status codes (1xx informational, 2xx success, 3xx redirect, 4xx client error, 5xx server error)
- Content type and encoding
- Server identification
Cookies:
- HttpOnly - cannot be accessed by JavaScript (XSS mitigation)
- Secure - only sent over HTTPS
- SameSite - CSRF protection (Strict, Lax, None)
From Hacking APIs, Chapter 6: Discovery (pp. 125-147)
The passive reconnaissance process has three phases:
- Cast a Wide Net - Gather general information about the target
- Adapt and Focus - Refine based on findings
- Document the Attack Surface - Record everything useful
Key quote:
"Taking notes is crucial to performing an effective attack. Document and take screen captures of all interesting findings."
From Full Stack Python Security
Chapter 7 (pp. 86-89) - HTTP Cookies:
- Cookies are sent via the Set-Cookie response header
- Session IDs are commonly stored in cookies
- The Secure directive prevents transmission over HTTP
- The Domain directive controls which domains receive the cookie
Chapter 14 (pp. 222-224) - Security Response Headers:
- HttpOnly hides cookies from JavaScript (document.cookie)
- X-Content-Type-Options: nosniff prevents MIME sniffing attacks
- Missing security headers are common findings in assessments
From API Security in Action, Chapter 5 (pp. 151-153)
CORS Headers:
- Access-Control-Allow-Origin - Which origins can access resources
- Access-Control-Allow-Credentials - Whether cookies are sent
- Wildcard (*) with credentials is a security misconfiguration
Your Assignment
Build a Python script called web_recon.py that:
- Takes a URL as input (command line argument or user prompt)
- Makes an HTTP request to the target
- Extracts and displays the following information:
Part 1: Basic Response Information
Extract and display:
═══ RESPONSE STATUS ═══
Status Code: 200
Status Message: OK
HTTP Version: HTTP/1.1
Response Time: 0.234 seconds
Why this matters: Status codes reveal application behavior. A 403 vs 404 can indicate whether a resource exists.
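The Part 1 fields can be assembled with a small formatting helper. The sketch below is one possible shape, not the reference solution: the name describe_status is hypothetical, and it assumes the HTTP version arrives as the integer that http.client reports (10 or 11), which requests also exposes as response.raw.version.

```python
from http import HTTPStatus

# http.client reports the protocol version as an integer (10 or 11).
_HTTP_VERSIONS = {10: "HTTP/1.0", 11: "HTTP/1.1"}


def describe_status(status_code, raw_version, elapsed_seconds):
    """Return the Part 1 report lines for one response (hypothetical helper)."""
    try:
        message = HTTPStatus(status_code).phrase
    except ValueError:
        # Non-standard status codes have no registered phrase.
        message = "(unknown)"
    version = _HTTP_VERSIONS.get(raw_version, f"(unknown: {raw_version})")
    return [
        f"Status Code: {status_code}",
        f"Status Message: {message}",
        f"HTTP Version: {version}",
        f"Response Time: {elapsed_seconds:.3f} seconds",
    ]


if __name__ == "__main__":
    for line in describe_status(200, 11, 0.234):
        print(line)
```

With requests, the inputs would likely come from response.status_code, response.raw.version, and response.elapsed.total_seconds(); response.reason gives the server's own status message if you prefer it over the standard phrase.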
Part 2: Server Information (Technology Fingerprinting)
Look for headers that reveal server technology and display their values:
═══ SERVER INFORMATION ═══
Server: nginx/1.18.0
X-Powered-By: PHP/7.4.3
X-AspNet-Version: (not present)
X-Generator: WordPress 5.8
X-Drupal-Cache: (not present)
Headers to check:
- Server
- X-Powered-By
- X-AspNet-Version
- X-Generator
- X-Drupal-Cache
- X-Varnish
Why this matters: These headers reveal what software is running. Security analysts can then search for CVEs affecting those versions.
Note: Just report if these headers exist and show their values. Let the human analyst assess the risk - that's what real recon tools do.
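A minimal sketch of the Part 2 lookup, assuming the response headers are available as a plain mapping of names to values. The function name report_server_headers is hypothetical. Header names are case-insensitive in HTTP, so the sketch normalises them before comparing.

```python
# The six fingerprinting headers listed in Part 2.
FINGERPRINT_HEADERS = [
    "Server",
    "X-Powered-By",
    "X-AspNet-Version",
    "X-Generator",
    "X-Drupal-Cache",
    "X-Varnish",
]


def report_server_headers(headers):
    """Return (header, value) pairs, using '(not present)' for missing ones."""
    # Lower-case the names so 'server' and 'Server' are treated the same.
    lowered = {name.lower(): value for name, value in headers.items()}
    return [
        (name, lowered.get(name.lower(), "(not present)"))
        for name in FINGERPRINT_HEADERS
    ]


if __name__ == "__main__":
    sample = {"server": "nginx/1.18.0", "X-Powered-By": "PHP/7.4.3"}
    for name, value in report_server_headers(sample):
        print(f"{name}: {value}")
```

The function only reports values and makes no risk judgement, which matches the note above.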
Part 3: Security Headers Analysis
Check for presence/absence of security headers:
═══ SECURITY HEADERS ═══
X-Frame-Options: DENY ✓
X-Content-Type-Options: nosniff ✓
X-XSS-Protection: (not present) ⚠️
Strict-Transport-Security: max-age=31536000; includeSubDomains ✓
Content-Security-Policy: (not present) ⚠️
Referrer-Policy: strict-origin-when-cross-origin ✓
Permissions-Policy: (not present) ⚠️
Security Header Score: 4/7
Headers to check:
| Header | Purpose | Risk if Missing |
|---|---|---|
| X-Frame-Options | Prevents clickjacking | Clickjacking attacks |
| X-Content-Type-Options | Prevents MIME sniffing | Content type attacks |
| X-XSS-Protection | Browser XSS filter | XSS (legacy browsers) |
| Strict-Transport-Security | Enforces HTTPS | Downgrade attacks |
| Content-Security-Policy | Controls resource loading | XSS, injection |
| Referrer-Policy | Controls referrer info | Information leakage |
| Permissions-Policy | Controls browser features | Privacy issues |
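The presence check and the score can be computed in one pass over the seven headers in the table. This is a sketch under the assumption that headers arrive as a plain name-to-value mapping; check_security_headers is a hypothetical name.

```python
# The seven headers from the table above, one point each.
SECURITY_HEADERS = [
    "X-Frame-Options",
    "X-Content-Type-Options",
    "X-XSS-Protection",
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "Referrer-Policy",
    "Permissions-Policy",
]


def check_security_headers(headers):
    """Return ({header: value or None}, number of headers present)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    results = {name: lowered.get(name.lower()) for name in SECURITY_HEADERS}
    present = sum(value is not None for value in results.values())
    return results, present


if __name__ == "__main__":
    sample = {
        "X-Frame-Options": "DENY",
        "X-Content-Type-Options": "nosniff",
        "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
        "Referrer-Policy": "strict-origin-when-cross-origin",
    }
    results, present = check_security_headers(sample)
    for name, value in results.items():
        print(f"{name}: {value if value is not None else '(not present)'}")
    print(f"Security Header Score: {present}/{len(SECURITY_HEADERS)}")
```

Run against the sample values shown in Part 3, this yields a score of 4/7.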
Part 4: Cookie Analysis
Parse all Set-Cookie headers and analyze security attributes:
═══ COOKIES ═══
Cookie 1: session
Value: abc123...def (truncated)
HttpOnly: ✓ Yes
Secure: ✓ Yes
SameSite: Strict
Domain: .example.com
Path: /
Max-Age: 86400 (1 day)
Cookie 2: tracking_id
Value: xyz789
HttpOnly: ✗ No ⚠️
Secure: ✗ No ⚠️
SameSite: (not set) ⚠️
⚠️ FINDING: Cookie 'tracking_id' missing HttpOnly - vulnerable to XSS theft
⚠️ FINDING: Cookie 'tracking_id' missing Secure - sent over HTTP
⚠️ FINDING: Cookie 'tracking_id' missing SameSite - CSRF risk
Cookie attributes to extract:
- Name and value (truncate long values)
- HttpOnly flag (boolean)
- Secure flag (boolean)
- SameSite value (Strict/Lax/None or not set)
- Domain scope
- Path scope
- Max-Age or Expires
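One way to pull these attributes out of a single Set-Cookie header value is to split on semicolons, since the first segment is name=value and every later segment is an attribute. The sketch below is hand-rolled for illustration. The function name parse_set_cookie, the returned dict shape, and the 12-character truncation limit are all assumptions.

```python
# Attributes that carry a value; HttpOnly and Secure are bare flags.
VALUE_ATTRIBUTES = ("samesite", "domain", "path", "max-age", "expires")


def parse_set_cookie(header_value, max_value_len=12):
    """Parse one Set-Cookie header value into a dict (hypothetical shape)."""
    parts = [part.strip() for part in header_value.split(";")]
    name, _, value = parts[0].partition("=")
    if len(value) > max_value_len:
        value = value[:max_value_len] + "..."  # truncate long values
    cookie = {
        "name": name.strip(),
        "value": value,
        "httponly": False,
        "secure": False,
    }
    cookie.update({attr: None for attr in VALUE_ATTRIBUTES})
    for part in parts[1:]:
        key, _, attr_value = part.partition("=")
        key = key.strip().lower()  # attribute names are case-insensitive
        if key in ("httponly", "secure"):
            cookie[key] = True
        elif key in VALUE_ATTRIBUTES:
            cookie[key] = attr_value.strip()
    return cookie


if __name__ == "__main__":
    raw = "session=abc123; HttpOnly; Secure; SameSite=Strict; Path=/; Max-Age=86400"
    print(parse_set_cookie(raw))
```

A response can carry several Set-Cookie headers, so each one needs to be parsed separately. With requests, response.raw.headers.getlist("Set-Cookie") is one way to get them as a list, because response.headers joins repeated headers into one comma-separated string.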
Part 5: CORS Configuration
Check for CORS headers and potential misconfigurations:
═══ CORS CONFIGURATION ═══
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
Access-Control-Allow-Methods: GET, POST, DELETE
Access-Control-Allow-Headers: Content-Type, Authorization
🚨 CRITICAL: Wildcard origin (*) with Allow-Credentials is a security vulnerability!
CORS security rules:
- * origin with credentials: true = CRITICAL vulnerability
- Overly permissive methods (DELETE, PUT) = note for testing
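The two rules above reduce to a pair of string comparisons. The following sketch assumes headers come in as a plain mapping; the function name check_cors and the keys of the returned dict are hypothetical.

```python
def check_cors(headers):
    """Apply the two CORS rules to a mapping of response headers."""
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    origin = lowered.get("access-control-allow-origin")
    credentials = lowered.get("access-control-allow-credentials", "").lower()
    methods = lowered.get("access-control-allow-methods", "")
    allowed = [m.strip().upper() for m in methods.split(",") if m.strip()]
    return {
        # Rule 1: wildcard origin combined with credentials.
        "wildcard_with_credentials": origin == "*" and credentials == "true",
        # Rule 2: methods worth noting for later testing.
        "methods_to_test": [m for m in allowed if m in ("DELETE", "PUT")],
    }


if __name__ == "__main__":
    sample = {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Credentials": "true",
        "Access-Control-Allow-Methods": "GET, POST, DELETE",
    }
    print(check_cors(sample))
```

Note that many servers only emit CORS headers when the request includes an Origin header, so a plain GET may show nothing even when CORS is configured.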
Part 6: Additional Reconnaissance
Extract any other useful information:
═══ ADDITIONAL INFORMATION ═══
Content-Type: text/html; charset=utf-8
Content-Length: 45678
Content-Encoding: gzip
Cache-Control: no-cache, no-store
ETag: "abc123"
Interesting Headers Found:
- X-Request-ID: req-12345 (useful for log correlation)
- X-RateLimit-Remaining: 99 (rate limiting detected)
- Via: 1.1 proxy.example.com (proxy detected)
Part 7: Generate Summary Report
End with an executive summary:
═══════════════════════════════════════════════════════════════
RECONNAISSANCE SUMMARY
═══════════════════════════════════════════════════════════════
Target: https://example.com
Scan Time: 2026-01-03 10:30:00 UTC
FINDINGS:
🚨 CRITICAL (1):
- CORS misconfiguration: wildcard with credentials
⚠️ MEDIUM (3):
- Missing Content-Security-Policy header
- Cookie 'tracking_id' missing HttpOnly
- Missing X-Frame-Options header
ℹ️ INFO (4):
- Server: nginx/1.18.0
- X-Powered-By: PHP/7.4.3
- Rate limiting detected
- Proxy detected in request path
SECURITY HEADER SCORE: 4/7 (57%)
COOKIE SECURITY SCORE: 3/6 points (50%)
RECOMMENDED NEXT STEPS:
1. Search for CVEs: nginx 1.18.0, PHP 7.4.3
2. Test for clickjacking (X-Frame-Options missing)
3. Test XSS vectors (no CSP)
4. Test CSRF on state-changing operations
═══════════════════════════════════════════════════════════════
Example Output
When run against a target, your script should produce output similar to:
$ python3 web_recon.py https://example.com
[*] Starting reconnaissance on: https://example.com
[*] Time: 2026-01-03T10:30:00Z
═══ RESPONSE STATUS ═══
Status Code: 200
Status Message: OK
Response Time: 0.342s
═══ SERVER INFORMATION ═══
Server: ECS (dcb/7F84)
X-Powered-By: (not present)
═══ SECURITY HEADERS ═══
X-Frame-Options: (not present) ⚠️
X-Content-Type-Options: (not present) ⚠️
Strict-Transport-Security: (not present) ⚠️
Content-Security-Policy: (not present) ⚠️
Security Header Score: 0/7
═══ COOKIES ═══
No cookies set.
═══ CORS CONFIGURATION ═══
No CORS headers present.
═══ RECONNAISSANCE SUMMARY ═══
Target: https://example.com
Findings: 4 missing security headers
Recommendation: Review security header configuration
Scoring Rules
Cookie Security Score
Each cookie is scored out of 3 points:
| Attribute | Points | Rule |
|---|---|---|
| HttpOnly | 1 | Present = 1 point, Missing = 0 |
| Secure | 1 | Present = 1 point, Missing = 0 |
| SameSite | 1 | Strict or Lax = 1 point, None or missing = 0 |
Cookie Security Score = (Total points earned) / (Total possible points) × 100%
Example:
Cookie 1: session → HttpOnly ✓, Secure ✓, SameSite=Strict ✓ → 3/3
Cookie 2: tracking → 1 of 3 attributes present → 1/3
Cookie 3: preference → HttpOnly ✗, Secure ✗, SameSite=Lax ✓ → 1/3
Total: 5/9 = 55.6%
Cookie Security Score: 55.6%
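The scoring rule can be sketched as a per-cookie point count followed by one division. The helper names and the cookie dict shape (keys httponly, secure, samesite) are assumptions for illustration. Which single attribute the 'tracking' cookie has is also assumed here, since only its 1/3 total matters for the arithmetic.

```python
def cookie_points(cookie):
    """Score one cookie out of 3 using the table's rules."""
    points = int(bool(cookie.get("httponly"))) + int(bool(cookie.get("secure")))
    samesite = (cookie.get("samesite") or "").lower()
    if samesite in ("strict", "lax"):  # None or missing earns nothing
        points += 1
    return points


def cookie_security_score(cookies):
    """Return (earned, possible, percent); percent is None with no cookies."""
    earned = sum(cookie_points(cookie) for cookie in cookies)
    possible = 3 * len(cookies)
    percent = 100.0 * earned / possible if possible else None
    return earned, possible, percent


if __name__ == "__main__":
    cookies = [
        {"name": "session", "httponly": True, "secure": True, "samesite": "Strict"},
        # Assumption: the one attribute 'tracking' has is Secure.
        {"name": "tracking", "httponly": False, "secure": True, "samesite": None},
        {"name": "preference", "httponly": False, "secure": False, "samesite": "Lax"},
    ]
    earned, possible, percent = cookie_security_score(cookies)
    print(f"Total: {earned}/{possible} = {percent:.1f}%")
```

Returning None for the percentage when no cookies are set avoids a division by zero and lets the report print "No cookies set." instead of a misleading 0%.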
Security Header Score
Score out of 7 points (1 point per header present):
| Header | Points |
|---|---|
| X-Frame-Options | 1 |
| X-Content-Type-Options | 1 |
| X-XSS-Protection | 1 |
| Strict-Transport-Security | 1 |
| Content-Security-Policy | 1 |
| Referrer-Policy | 1 |
| Permissions-Policy | 1 |
Security Header Score = Headers present / 7 × 100%
Finding Severity Levels
Use these rules to categorize each finding:
🚨 CRITICAL
| Finding | Condition |
|---|---|
| CORS misconfiguration | Access-Control-Allow-Origin: * AND Access-Control-Allow-Credentials: true both present |
⚠️ MEDIUM
| Finding | Condition |
|---|---|
| Missing Content-Security-Policy | Content-Security-Policy header not present |
| Missing HSTS | Strict-Transport-Security header not present (HTTPS sites only) |
| Missing X-Frame-Options | X-Frame-Options header not present |
| Missing X-Content-Type-Options | X-Content-Type-Options header not present |
| Cookie missing HttpOnly | Any cookie where HttpOnly is not set |
| Cookie missing Secure | Any cookie where Secure is not set (HTTPS sites only) |
| Cookie missing SameSite | Any cookie without SameSite attribute or with SameSite=None |
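Two of the MEDIUM rules apply only to HTTPS targets, so the classifier needs the URL scheme as well as the headers and cookies. The sketch below shows one way to encode the table. The function name medium_findings and the cookie dict shape (keys name, httponly, secure, samesite) are assumptions.

```python
from urllib.parse import urlsplit


def medium_findings(url, headers, cookies):
    """Return MEDIUM finding strings for one response (hypothetical helper)."""
    is_https = urlsplit(url).scheme.lower() == "https"
    present = {name.lower() for name in headers}
    findings = []
    for name in ("Content-Security-Policy", "X-Frame-Options",
                 "X-Content-Type-Options"):
        if name.lower() not in present:
            findings.append(f"Missing {name} header")
    # HSTS only makes sense on HTTPS responses.
    if is_https and "strict-transport-security" not in present:
        findings.append("Missing Strict-Transport-Security header")
    for cookie in cookies:
        label = f"Cookie '{cookie['name']}'"
        if not cookie.get("httponly"):
            findings.append(f"{label} missing HttpOnly")
        # The Secure rule is likewise limited to HTTPS sites.
        if is_https and not cookie.get("secure"):
            findings.append(f"{label} missing Secure")
        if (cookie.get("samesite") or "").lower() in ("", "none"):
            findings.append(f"{label} missing SameSite attribute")
    return findings


if __name__ == "__main__":
    cookie = {"name": "tracking", "httponly": False, "secure": False, "samesite": None}
    for finding in medium_findings("https://example.com", {}, [cookie]):
        print("-", finding)
```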
ℹ️ INFO
| Finding | Condition |
|---|---|
| Server header present | Server header exists - display its value |
| X-Powered-By present | X-Powered-By header exists - display its value |
| Technology fingerprint | X-Generator, X-AspNet-Version, X-Drupal-Cache, etc. present |
| Rate limiting detected | X-RateLimit-* or RateLimit-* headers present |
| Proxy detected | Via or X-Forwarded-* headers present |
| CDN detected | X-Cache, CF-Ray, X-Served-By, or similar headers present |
| Missing X-XSS-Protection | X-XSS-Protection header not present (deprecated header) |
| Missing Permissions-Policy | Permissions-Policy header not present |
| Missing Referrer-Policy | Referrer-Policy header not present |
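The wildcard rows in the table (X-RateLimit-*, RateLimit-*, X-Forwarded-*) can be handled with prefix matching on lower-cased header names, with no regex needed. This is a sketch only: detect_infrastructure is a hypothetical name, and the CDN list covers just the three headers the table names, not the "or similar" ones.

```python
def detect_infrastructure(headers):
    """Return INFO findings for rate limiting, proxies, and CDNs."""
    names = [name.lower() for name in headers]
    findings = []
    # str.startswith accepts a tuple, which covers both wildcard patterns.
    if any(n.startswith(("x-ratelimit-", "ratelimit-")) for n in names):
        findings.append("Rate limiting detected")
    if any(n == "via" or n.startswith("x-forwarded-") for n in names):
        findings.append("Proxy detected")
    if any(n in ("x-cache", "cf-ray", "x-served-by") for n in names):
        findings.append("CDN detected")
    return findings


if __name__ == "__main__":
    sample = {
        "X-RateLimit-Remaining": "99",
        "Via": "1.1 proxy.example.com",
    }
    print(detect_infrastructure(sample))
```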
Why is the server version INFO and not a higher severity?
Telling a versioned value such as nginx/1.18.0 apart from a bare nginx requires regex pattern matching. We're keeping this exercise focused on HTTP fundamentals and string parsing. Just report what you find - security analysts know to look up CVEs for any versions shown.
Example Severity Classification
Target: https://example.com
🚨 CRITICAL (0):
(none)
⚠️ MEDIUM (5):
- Missing Content-Security-Policy header
- Missing Strict-Transport-Security header
- Cookie 'tracking' missing HttpOnly
- Cookie 'tracking' missing SameSite attribute
- Missing X-Frame-Options header
ℹ️ INFO (4):
- Server: nginx/1.18.0
- X-Powered-By: PHP/7.4.3
- CDN detected: Cloudflare (CF-Ray header)
- Missing Permissions-Policy header
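A classification like the one above can be printed from a flat list of (severity, text) pairs. The sketch below assumes that representation; the function name format_findings is hypothetical, and the emoji prefixes are left out to keep the output plain ASCII.

```python
# Severity levels in the order the report prints them.
SEVERITY_ORDER = ["CRITICAL", "MEDIUM", "INFO"]


def format_findings(findings):
    """Group (severity, text) pairs and return the report lines."""
    lines = []
    for severity in SEVERITY_ORDER:
        items = [text for level, text in findings if level == severity]
        lines.append(f"{severity} ({len(items)}):")
        if items:
            lines.extend(f"  - {text}" for text in items)
        else:
            lines.append("  (none)")
    return lines


if __name__ == "__main__":
    sample = [
        ("MEDIUM", "Missing Content-Security-Policy header"),
        ("INFO", "Server: nginx/1.18.0"),
        ("MEDIUM", "Missing X-Frame-Options header"),
    ]
    print("\n".join(format_findings(sample)))
```

Keeping findings as data until the last step also makes the JSON export bonus challenge a small addition.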
Grading Criteria
Your script will be evaluated on:
| Criteria | Points |
|---|---|
| Successfully fetches URL and handles errors | 10 |
| Extracts status code, message, response time | 10 |
| Reports server information headers | 15 |
| Checks all 7 security headers | 15 |
| Parses cookies with all attributes | 20 |
| Detects CORS misconfigurations | 15 |
| Generates clear, readable summary | 10 |
| Code quality and error handling | 5 |
| Total | 100 |
Bonus Challenges
Once your basic script works:
- Add robots.txt fetching - Check for /robots.txt and list disallowed paths
- Check multiple URLs - Accept a file of URLs to scan
- Export to JSON - Save findings in structured format
- Compare to baseline - Load a "known good" config and diff against it
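For the robots.txt bonus, the parsing half can be written and tested without any network access. The sketch below takes the file's text and returns the Disallow paths. The function name disallowed_paths is hypothetical, and it deliberately ignores which User-agent group each rule belongs to.

```python
def disallowed_paths(robots_txt):
    """Return every non-empty Disallow path found in robots.txt text."""
    paths = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        field, separator, value = line.partition(":")
        value = value.strip()
        # An empty Disallow value means "allow everything", so skip it.
        if separator and field.strip().lower() == "disallow" and value:
            paths.append(value)
    return paths


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /admin/\nDisallow: /private  # staff only\n"
    print(disallowed_paths(sample))
```

Fetching the file is then a second request to the target's /robots.txt, for example built with urllib.parse.urljoin(target_url, "/robots.txt").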
Submission
When complete, your deliverables should include:
- web_recon.py - Your Python script
- Sample output from running against 2-3 real websites
Reference Solution
Stuck? Want to compare your approach?
View the reference solution on GitHub →
⚠️ Try it yourself first! The learning happens in the struggle. Only check the solution after you've attempted the exercise or if you're completely stuck.
The reference solution scores 91/100 and demonstrates:
- File-based URL input with comment skipping
- Complete security header analysis (7/7 headers)
- Cookie parsing with HttpOnly, Secure, and SameSite extraction
- CORS misconfiguration detection
- Severity-categorized findings (CRITICAL/MEDIUM/INFO)
- Security Header Score and Cookie Security Score calculations
Usage
# Scan multiple URLs from a file
python3 web_recon.py practice_urls.txt
# Scan a single URL directly
python3 web_recon.py https://example.com
File format (practice_urls.txt):
# Lines starting with # are comments
https://github.com
https://example.com
https://stripe.com
Note: The reference solution accepts a file of URLs. You can implement yours to accept either a file or a single URL directly.
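Accepting either form of input can be handled by checking whether the argument looks like a URL before treating it as a file path. This is a sketch of that idea, not the reference solution's code; load_targets is a hypothetical name.

```python
def load_targets(argument):
    """Return a list of URLs from a single URL or a file of URLs."""
    if argument.startswith(("http://", "https://")):
        return [argument]
    targets = []
    with open(argument, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and lines starting with '#'.
            if line and not line.startswith("#"):
                targets.append(line)
    return targets


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        for url in load_targets(sys.argv[1]):
            print("[*] Target:", url)
    else:
        print("usage: python3 web_recon.py <url-or-file>")
```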
Resources
Primary Sources
- Grace Nolan's Notes: https://github.com/gracenolan/Notes
- Hacking APIs (Corey Ball) - Chapter 6: Discovery
- Full Stack Python Security - Chapters 7 and 14
- API Security in Action - Chapter 5
Reference Documentation
Tools to Compare Against
- SecurityHeaders.com - Online header scanner
- Mozilla Observatory - Comprehensive scanner
Why This Matters
This exercise builds skills directly applicable to:
- Bug Bounty Hunting: Reconnaissance is the first step in finding vulnerabilities
- Penetration Testing: Discovery phase of every engagement
- Security Auditing: Automated header/cookie policy checking
- Security Engineering Interviews: Grace Nolan specifically lists this as a coding challenge
As noted in Hacking APIs:
"Taking notes is crucial to performing an effective attack. Document and take screen captures of all interesting findings."
Your script automates this documentation process.
Found This Helpful?
This exercise is part of my 48-Week Security Engineering Curriculum - a complete roadmap from networking fundamentals to landing a Security Engineering role.
⭐ Star the repo on GitHub to:
- Support the project
- Get notified when new exercises drop
- Help others discover these resources
The curriculum includes:
- Weekly study guides with page-by-page reading assignments
- Hands-on scripting exercises (like this one!)
- PortSwigger lab progressions
- Interview prep from Grace Nolan's notes
Share Your Results!
Built your recon tool? I'd love to see it!
- Compare with the reference solution - See how your approach differs
- Tweet your output with #SecurityEngineering and tag me
- Open a PR to add your solution to the community solutions folder
- Post interesting findings (from authorized testing only!) in the discussions
Week 4 of the 48-Week Security Engineering Curriculum: Linux Security + Python Files