Introduction
In Part 1, we explored the architectural problem space: why manual test operations break down in hybrid cloud-to-on-premises setups that run many testing environments in parallel.
The core problems:
- Data Architecture Gap: Spreadsheet-based management with no versioning or traceability
- Execution Architecture Gap: Manual clicking with no orchestration or CI/CD integration
- Consolidation Architecture Gap: 3-4 days of manual merging with arbitrary conflict resolution
The insight: Test operations need the same architectural rigor as code operations.
This article introduces the complete solution: a 6-phase test automation system architecture that transforms test data from disposable artifacts into cumulative organizational knowledge, and test operations from manual bottlenecks into automated accelerators.
Table of Contents
- Architecture Overview
- System Components
- The 6-Phase Lifecycle
- Phase 1: Feature Testing in Cloud Environments
- Phase 1.5: Release Tagging Validation
- The Transformation Delivered
- What's Coming Next
- About This Series
Architecture Overview
📄 Master diagram (PDF):
https://github.com/HariOm-Labs/.github/blob/main/assets/hariom-labs-hybrid-test-automation-architecture-master-diagram-v1.0.pdf
The complete system consists of 8 distinct layers spanning cloud development through on-premises production:
┌─────────────────────────────────────────────────────────────────┐
│ Layer 1: Actors & Triggers │
│ (Developers, QA, CI/CD) │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ Layer 2: User Interfaces │
│ [Test Data Studio] [Run Orchestrator] [Review UI] │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ Layer 3: Service Layer (REST APIs) │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ Layer 4: Orchestration & Automation │
│ [CI/CD] [Release Tagging] [Execution Orchestrator] │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ Layer 5: Cloud Layer │
│ [Dev] [ST-1] [ST-2] ... [ST-N] │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ Layer 6: Data Consolidation Zone │
│ [Consolidation Service] [Conflict Resolution] │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ Layer 7: On-Premises Layer │
│ [PreProd Staging] [PreProd Master] [Production] │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ Layer 8: External Systems │
│ [Test Mgmt] [Defect Tracking] [Reporting] │
└─────────────────────────────────────────────────────────────────┘
Layer 1: Actors & Triggers
Who initiates workflows:
- Developers (code commits, branch merges)
- QA Engineers (test data creation, manual test runs)
- QA Leads (release tagging, consolidation approval)
- CI/CD Systems (automated triggers, scheduled runs)
Layer 2: User Interfaces
How users interact with the system:
- Test Data Studio: Web UI for structured test data creation
- Run Orchestrator Dashboard: Test execution management and monitoring
- Review Interface: Manual review for unresolved conflicts
Layer 3: Service Layer
RESTful APIs providing:
- CRUD operations for test data
- Execution management (trigger, schedule, monitor)
- Consolidation control and status
- Reporting and metrics
Layer 4: Orchestration & Automation
The intelligent automation layer:
- CI/CD Pipelines: Automated deployment and testing triggers
- Release Tagging Service: Validation and locking for consolidation
- Execution Orchestrator: Parallel test execution management
Layer 5: Cloud Layer
Fast iteration environments:
- Dev Environment: Developer smoke testing
- ST-1, ST-2, ... ST-N: Isolated system test environments (typically 8-15)
- Each with its own database containing environment-specific test data
Layer 6: Data Consolidation Zone
The intelligence layer that solves the 3-4 day problem:
- Data Consolidation Service: Multi-source merging with conflict detection
- Conflict Resolution Engine: Rule-based automatic resolution (70-80% success rate)
- Manual Review Queue: Human decision for complex conflicts
Layer 7: On-Premises Layer
Production-ready environments:
- PreProd Staging Database: Consolidated data for release testing
- PreProd Master Database: Cumulative test knowledge repository
- Production: Live system with production data
Layer 8: External Systems
Integration touchpoints:
- Test Management Tools (test case tracking)
- Defect Tracking Systems (bug management)
- Reporting Infrastructure (metrics, dashboards)
- Monitoring and Alerting (system health)
System Components
┌─────────────────────────────────────────────────────┐
│                  System Components                   │
├─────────────────────────────────────────────────────┤
│                                                      │
│  ┌──────────────────────┐      ┌──────────────────┐  │
│  │  Test Data Studio    │  ◄   │   QA Engineer    │  │
│  └──────────────────────┘      └──────────────────┘  │
│            │ Creates test data with automatic tagging│
│            ↓                                          │
│  ┌─────────────────────────────────────────────────┐ │
│  │             ST-1, ST-2, ... ST-N                │ │
│  │          (Cloud Test Environments)              │ │
│  └─────────────────────────────────────────────────┘ │
│            │ Triggers                                 │
│            ↓                                          │
│  ┌──────────────────────┐      ┌──────────────────┐  │
│  │  Run Orchestrator    │  ◄   │  CI/CD Pipeline  │  │
│  └──────────────────────┘      └──────────────────┘  │
│            │ Test Results                             │
│            ↓                                          │
│  ┌─────────────────────────────────────────────────┐ │
│  │    Release Tagging Service (Phase 1.5 Gate)     │ │
│  └─────────────────────────────────────────────────┘ │
│            │ Tagged & Validated Data                  │
│            ↓                                          │
│  ┌─────────────────────────────────────────────────┐ │
│  │          Data Consolidation Service             │ │
│  └─────────────────────────────────────────────────┘ │
│            │ Consolidated Data                        │
│            ↓                                          │
│  ┌─────────────────────────────────────────────────┐ │
│  │ PreProd Master Database (Cumulative Knowledge)  │ │
│  └─────────────────────────────────────────────────┘ │
│                                                      │
└─────────────────────────────────────────────────────┘
The architecture implements six core components:
1. Test Data Studio
Purpose: Structured test data creation with automatic metadata capture
Features:
- Web-based UI for creating test data (users, accounts, transactions, etc.)
- Schema-enforced data entry (validation at creation time)
- Automatic tagging on every create/update:
  - environment: Which ST environment (ST-1, ST-2, etc.)
  - feature: Which feature being tested (FEAT-123)
  - release: Which release (R2, R3, etc.)
  - tagged_at: Timestamp
  - tagged_by: Creator user ID
- Relationship management (foreign key enforcement)
- Data templates for common scenarios
- Bulk import/export capabilities
Problem Solved: Replaces spreadsheet chaos with structured, traceable data creation
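To make the automatic tagging concrete, here is a minimal sketch in Python of how a create operation could stamp every record with its context. The names (`TaggedRecord`, `create_record`) are illustrative assumptions, not the actual Test Data Studio API:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class TaggedRecord:
    """A test data record plus the metadata captured automatically at creation."""
    payload: dict       # the business data: user, account, transaction, ...
    environment: str    # e.g. "ST-2"
    feature: str        # e.g. "FEAT-123"
    release: str        # e.g. "R2"
    tagged_at: str      # ISO-8601 UTC timestamp
    tagged_by: str      # creator user ID

def create_record(payload: dict, environment: str, feature: str,
                  release: str, user: str) -> TaggedRecord:
    """Create a record and attach environment/feature/release context automatically."""
    return TaggedRecord(
        payload=payload,
        environment=environment,
        feature=feature,
        release=release,
        tagged_at=datetime.now(timezone.utc).isoformat(),
        tagged_by=user,
    )

record = create_record(
    {"user_id": "testuser123", "role": "Admin"},
    environment="ST-2", feature="FEAT-123", release="R2",
    user="qa.engineer@company.com",
)
```

The point is that the caller never fills in the metadata by hand; the creation path does it every time.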
2. Run Orchestrator
Purpose: Automated test execution engine
Features:
- Trigger test suites automatically (commit, schedule, on-demand)
- Parallel execution across multiple environments
- Support for multiple test frameworks
- Result aggregation and reporting
- Integration with CI/CD pipelines
- Execution history and trends
Problem Solved: Replaces 40-80 hours of manual clicking with automated orchestration
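As a rough illustration of the parallel-execution idea, the sketch below fans one suite out across several ST environments with a thread pool. `run_suite` is a hypothetical stand-in for whatever framework-specific runner the orchestrator actually invokes:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

ENVIRONMENTS = ["ST-1", "ST-2", "ST-3"]  # in practice, discovered from configuration

def run_suite(environment: str, suite: str) -> dict:
    """Hypothetical stand-in for invoking a test framework against one environment."""
    # e.g. shell out to pytest / JUnit / Robot with the environment's base URL
    return {"environment": environment, "suite": suite, "passed": True}

def orchestrate(suite: str) -> list[dict]:
    """Run the same suite against every environment in parallel and collect results."""
    with ThreadPoolExecutor(max_workers=len(ENVIRONMENTS)) as pool:
        futures = [pool.submit(run_suite, env, suite) for env in ENVIRONMENTS]
        return [f.result() for f in as_completed(futures)]

results = orchestrate("regression-smoke")
print(sum(r["passed"] for r in results), "of", len(results), "environments passed")
```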
3. Release Tagging Service
Purpose: Phase 1.5 validation gate - the critical innovation
Features:
- Scans all ST databases when release branch is merged
- Identifies untagged or mis-tagged data
- Applies release tags automatically
- Generates comprehensive manifest of what will consolidate
- Locks data to prevent changes during consolidation
- Validation rules and error detection
Problem Solved: Prevents consolidation problems before they start; 30-60 minutes of validation prevents 4-8 hours of consolidation headaches
4. Data Consolidation Service
Purpose: Intelligent multi-source data merging
Features:
- Collects data from all ST environments
- Detects conflicts automatically
- Applies resolution rules (temporal priority, feature priority, etc.)
- Validates data integrity and relationships
- Generates audit trail of all decisions
- Manual review queue for unresolved conflicts
- Rollback capability
Problem Solved: Reduces 3-4 days of manual consolidation to 4-8 hours automated
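As one hedged example of what a resolution rule can look like, here is a sketch of temporal priority: when the same logical record arrives from two environments, keep the most recently tagged version and send ties to manual review. The function name and record shape are assumptions for illustration; Part 3 covers the real rule engine.

```python
def resolve_by_temporal_priority(candidates: list[dict]) -> dict | None:
    """Pick the most recently tagged version of a conflicting record.

    `candidates` are versions of the same logical record collected from different
    ST environments. ISO-8601 UTC timestamps sort lexically in chronological
    order, so a plain string sort is enough here. Returns None on a tie so the
    conflict falls through to the manual review queue.
    """
    ordered = sorted(candidates, key=lambda r: r["metadata"]["tagged_at"], reverse=True)
    if len(ordered) > 1 and ordered[0]["metadata"]["tagged_at"] == ordered[1]["metadata"]["tagged_at"]:
        return None
    return ordered[0]
```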
5. Master Database
Purpose: Cumulative test knowledge repository
Features:
- Preserves all test data across releases
- Supports cumulative regression testing
- Partitioned by release for performance
- Archival and retention policies
- Historical analysis and reporting
Problem Solved: Transforms test data from disposable to cumulative; coverage grows instead of declines
6. CI/CD Integration Layer
Purpose: Seamless pipeline integration
Features:
- Webhook triggers from Git events
- Pipeline status feedback
- Automated test execution on commit/merge
- Deployment gating based on test results
- Metrics reporting to dashboards
Problem Solved: Integrates testing into automated pipeline; no manual gates
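A minimal sketch of the webhook side, assuming a small Flask service (the article does not prescribe a stack, so Flask and the helper functions here are illustrative only): feature-branch pushes kick off feature pipelines, while release-branch merges start the Phase 1.5 tagging gate.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def trigger_feature_pipeline(ref: str) -> None:
    """Hypothetical: deploy the feature branch to a free ST environment and run its suite."""
    print(f"feature pipeline triggered for {ref}")

def trigger_release_tagging(ref: str) -> None:
    """Hypothetical: start the Phase 1.5 Release Tagging Service for this release branch."""
    print(f"release tagging triggered for {ref}")

@app.route("/hooks/git", methods=["POST"])
def handle_git_event():
    """Route incoming Git webhook events to the right automation."""
    event = request.get_json(force=True)
    ref = event.get("ref", "")
    if ref.startswith("refs/heads/feature/"):
        trigger_feature_pipeline(ref)
    elif ref.startswith("refs/heads/release/"):
        trigger_release_tagging(ref)
    return jsonify({"accepted": True}), 202
```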
The 6-Phase Lifecycle
The architecture implements a structured lifecycle that transforms how test operations work:
| Phase | Duration | Location | Purpose |
|---|---|---|---|
| Phase 1 | 1-2 weeks | Cloud (ST envs) | Feature testing with auto-tagging |
| Phase 1.5 | 30-60 min | All ST databases | Validation gate & manifest generation |
| Phase 2 | 4-8 hours | Consolidation service | Intelligent data merging |
| Phase 3 | 1-2 days | On-Prem PreProd | Two-tier release testing |
| Phase 4 | ~1 day | On-Prem Production | Synchronized deployment |
| Phase 5 | Ongoing | On-Prem Production | Continuous cumulative regression |
Each phase builds upon the previous, creating a system where test data becomes an appreciating asset.
Phase 1: Feature Testing in Cloud Environments
┌───────────────────────────────────────────────────┐
│ PHASE 1: Feature Testing │
│ │
│ Developer → [Git Push] → [CI/CD] → [Dev Deploy] │
│ ↓ │
│ [Dev Environment] │
│ ↓ │
│ QA Engineer → [Test Data Studio] → Creates Data │
│ ↓ │
│ [Automatic Tagging] │
│ environment, feature, release │
│ ↓ │
│ [ST-1] [ST-2] ... [ST-N] │
│ (Isolated Test Environments) │
│ ↓ │
│ [Run Orchestrator] → Execute Tests │
│ ↓ │
│ [Feature Approved] │
│ │
└───────────────────────────────────────────────────┘
↓
┌───────────────────────────────────────────────────┐
│ PHASE 1.5: Release Tagging │
│ │
│ Trigger: [Merge to Release Branch] │
│ ↓ │
│ [Release Tagging Service Starts] │
│ ↓ │
│ Scan: [ST-1] [ST-2] [ST-3] ... [ST-N] │
│ ↓ │
│ [Detect Untagged/Mis-tagged Data] │
│ ↓ │
│ [Auto-Correction Applied] │
│ ↓ │
│ [Generate Manifest] │
│ (Complete list of what's tagged) │
│ ↓ │
│ [Lock Data] │
│ ↓ │
│ [Ready for Consolidation] │
│ │
└───────────────────────────────────────────────────┘
Overview
Duration: 1-2 weeks (per feature)
Location: Cloud environments (Dev and ST-1 through ST-N)
Objective: Validate features in isolated environments with automatically tagged test data
The Development Flow
Step 1: Code Deployment
Developer pushes code to feature branch (e.g., feature/FEAT-123):
- CI/CD pipeline triggers automatically
- Code deploys to Dev environment
- Developer runs basic smoke tests
- Unit and integration tests run
Step 2: Test Data Creation
QA Engineer uses Test Data Studio:
- Opens web UI
- Selects environment (ST-2)
- Creates test data for the feature:
- Test users with specific roles
- Test accounts with required balances
- Test transactions with various scenarios
- Test configurations and settings
- Data validated at entry time (schema enforcement)
- Relationships tracked automatically (foreign keys)
Step 3: Automatic Tagging
As test data is created, the system automatically applies metadata:
{
"user_id": "testuser123",
"email": "testuser123@company.com",
"role": "Admin",
"metadata": {
"environment": "ST-2",
"feature": "FEAT-123",
"release": "R2",
"tagged_at": "2024-01-15T10:30:00Z",
"tagged_by": "qa.engineer@company.com"
}
}
This automatic tagging is crucial - it captures context without manual effort. Six months later, you can trace this data back to:
- Which environment it was created in
- Which feature it was testing
- Which release it belongs to
- Who created it and when
Step 4: Environment Deployment
CI/CD deploys feature code to available ST environment:
- ST environments are isolated (ST-1, ST-2, ... ST-N)
- Each has its own database
- Multiple features tested in parallel
- No interference between features
Step 5: Automated Test Execution
Run Orchestrator triggers test suites:
- Tests execute automatically (no manual clicking)
- Run in parallel across available ST environments
- Results aggregate and report to test management system
- Pass/fail status updates in real-time
Step 6: Feature Approval
Once tests pass:
- Feature approved for merge into release branch
- Test data remains in ST database, tagged and ready
- No manual preparation needed
- Data automatically included in consolidation
Key Architectural Benefits
Environment Isolation
Each ST environment operates independently. Teams work in parallel without coordination overhead. Data conflicts are impossible during feature development.
Automatic Context Capture
Every test data record carries its context. No manual documentation needed. Complete traceability built-in.
No Manual Coordination
Teams don't need to coordinate test data creation; the system handles it automatically. It scales to 10+ teams easily.
Release Preparation Built-In
By end of Phase 1, all test data is:
- Properly tagged
- Validated
- Ready for consolidation
- Traceable to features
No last-minute scrambling. The foundation for consolidation is laid during feature development, not before release.
Phase 1.5: Release Tagging Validation
Overview
Duration: 30-60 minutes
Location: All ST databases
Trigger: Merge to release branch
Objective: Validate, correct, and lock all test data before consolidation
Why Phase 1.5 Exists
This is the critical architectural innovation that makes the entire system work.
Before Phase 1.5 existed, organizations faced:
- Test data missing release tags (race conditions, manual errors)
- Incorrectly tagged data (wrong environment, wrong feature)
- No clear manifest of what will consolidate
- Hours spent during consolidation fixing tagging issues
Phase 1.5 was introduced as a validation gate that catches and fixes these problems before consolidation begins.
Think of it as the "build verification" step for test data - just like you verify code compiles before deploying, you verify test data is properly tagged before consolidating.
The Release Tagging Process
Step 1: Trigger
When release branch is merged (e.g., release/R2):
- Release Tagging Service automatically triggers
- Connects to all ST databases (ST-1, ST-2, ... ST-N)
- Begins comprehensive scan
Step 2: Database Scanning
Service performs comprehensive scan:
- Query all test data created/modified since last release
- Identify records with missing tags
- Detect mis-tagged records
- Validate tag accuracy
Query example:
-- :last_release_date and :current_environment are supplied per environment scan
SELECT * FROM test_data
WHERE created_at > :last_release_date
  AND (release IS NULL OR environment != :current_environment)
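Sketching the fan-out of that query across every ST database, with SQLite standing in for the real per-environment databases, and with the paths, table layout, and column names all assumptions for illustration:

```python
import sqlite3

ST_DATABASES = {"ST-1": "st1.db", "ST-2": "st2.db"}  # illustrative paths only

SCAN_SQL = """
SELECT id, environment, feature, "release"
FROM test_data
WHERE created_at > :last_release_date
  AND ("release" IS NULL OR environment != :current_environment)
"""

def scan_environment(name: str, path: str, last_release_date: str) -> list[tuple]:
    """Return the records in one ST database that are untagged or mis-tagged."""
    with sqlite3.connect(path) as conn:
        return conn.execute(SCAN_SQL, {
            "last_release_date": last_release_date,
            "current_environment": name,
        }).fetchall()

findings = {
    name: scan_environment(name, path, "2024-01-01T00:00:00Z")
    for name, path in ST_DATABASES.items()
}
for name, rows in findings.items():
    print(f"{name}: {len(rows)} records need attention")
```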
Step 3: Untagged Data Detection
Service flags records missing release tags due to:
- Manual data creation outside Test Data Studio
- System errors during tagging
- Legacy data created before tagging system existed
- Race conditions in high-concurrency scenarios
Example findings:
ST-1: 150 records found
- 145 properly tagged
- 5 missing release tag
ST-2: 200 records found
- 198 properly tagged
- 2 missing feature tag
Step 4: Mis-Tagging Correction
Service validates tag accuracy:
- Data tagged with environment: "ST-1" is actually in the ST-1 database? ✓
- Feature tag matches an active feature in the release? ✓
- Release tag format correct? ✓
Auto-correction applied for common issues:
- Missing release tag → Apply current release tag
- Wrong environment tag → Correct to actual database location
- Invalid feature ID → Flag for manual review
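Expressed as code, the corrections above reduce to a small, data-driven routine. This is a sketch with assumed names (`auto_correct`, `valid_features`), not the service's actual implementation:

```python
CURRENT_RELEASE = "R2"

def auto_correct(record: dict, actual_environment: str,
                 valid_features: set[str]) -> tuple[dict, bool]:
    """Apply the safe corrections in place; return (record, needs_manual_review)."""
    meta = record["metadata"]

    # Missing release tag -> apply the current release tag
    if not meta.get("release"):
        meta["release"] = CURRENT_RELEASE

    # Wrong environment tag -> correct it to the database the record actually lives in
    if meta.get("environment") != actual_environment:
        meta["environment"] = actual_environment

    # Invalid feature ID -> cannot be guessed safely, so flag for manual review
    needs_review = meta.get("feature") not in valid_features
    return record, needs_review
```

The pattern matters more than the specifics: fix what can be fixed deterministically, and never guess at anything that affects traceability.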
Step 5: Release Tag Application
Once validation complete:
- Service applies release tag (e.g., "R2") to all validated records
- Operation is atomic (all-or-nothing across all databases)
- Ensures consistency across all environments
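All-or-nothing semantics across several independent databases is the hard part of this step; in practice it calls for two-phase commit or a compensating rollback. The sketch below shows only the simplest version of the idea, again with SQLite as a stand-in: update every database inside a transaction and commit only once all updates have succeeded.

```python
import sqlite3

def apply_release_tag(db_paths: dict[str, str], release: str) -> bool:
    """Tag validated records in every ST database, or roll all of them back.

    Simplified illustration only: per-database transactions committed together at
    the end. Truly independent databases would need 2PC or compensation logic.
    """
    connections = {name: sqlite3.connect(path) for name, path in db_paths.items()}
    try:
        for conn in connections.values():
            # a real system would target the exact record set from the scan manifest
            conn.execute('UPDATE test_data SET "release" = ? WHERE "release" IS NULL', (release,))
        for conn in connections.values():
            conn.commit()
        return True
    except sqlite3.Error:
        for conn in connections.values():
            conn.rollback()
        return False
    finally:
        for conn in connections.values():
            conn.close()
```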
Step 6: Manifest Generation
Service generates comprehensive manifest:
Release: R2
Tagged at: 2024-01-15T14:30:00Z
Validation: PASSED
ST-1: 150 records
- FEAT-101: 45 records (20 users, 25 accounts)
- FEAT-102: 55 records (30 users, 15 accounts, 10 transactions)
- FEAT-103: 50 records (25 users, 25 accounts)
ST-2: 200 records
- FEAT-104: 80 records (40 users, 30 accounts, 10 transactions)
- FEAT-105: 70 records (35 users, 35 accounts)
- FEAT-106: 50 records (25 users, 15 accounts, 10 transactions)
ST-N: 75 records
- FEAT-107: 40 records (20 users, 20 accounts)
- FEAT-108: 35 records (15 users, 15 accounts, 5 transactions)
Total: 425 records across 8 environments
Issues Found: 7 (auto-corrected)
Ready for consolidation: YES
This manifest is invaluable:
- Shows exactly what will consolidate
- Identifies potential issues before they become problems
- Provides audit trail for compliance
- QA Lead can review before approving
Step 7: Data Locking
Once manifest generated and approved:
- Service locks all tagged data
- Prevents modifications during consolidation
- Ensures consistency between manifest and actual consolidation
- Unlock occurs after consolidation completes
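A hedged sketch of the locking idea, assuming a `locked_for_release` column (the article does not specify the mechanism): one update marks the tagged records, and every write path checks the flag until consolidation unlocks them.

```python
import sqlite3

def lock_for_consolidation(conn: sqlite3.Connection, release: str) -> int:
    """Mark every record tagged for this release as locked; returns how many were locked."""
    cur = conn.execute(
        'UPDATE test_data SET locked_for_release = ? WHERE "release" = ?',
        (release, release),
    )
    conn.commit()
    return cur.rowcount

def guard_write(record: dict) -> None:
    """Write-path guard: reject modifications to records locked for consolidation."""
    if record.get("locked_for_release"):
        raise PermissionError(
            f"record is locked for consolidation of {record['locked_for_release']}"
        )
```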
Why Phase 1.5 is the Breakthrough
Time Investment vs. Return:
- 30-60 minutes of automated validation
- Prevents 4-8 hours of consolidation debugging
- ROI: 8-16x time savings
Error Prevention:
- Catches tagging errors before consolidation
- Auto-corrects 80-90% of issues
- Flags complex issues for human review
- Zero errors propagate to consolidation
Visibility:
- Complete manifest of what's consolidating
- No surprises during consolidation
- QA Lead can review and approve
- Management has visibility into release scope
Traceability:
- Every data record traced to feature and release
- Audit trail for compliance
- Historical analysis possible
- Can answer "why does this data exist?" six months later
Confidence:
- Team knows consolidation will work
- Manifest provides certainty
- Issues resolved proactively
- Reduces consolidation stress
Real-World Example
Without Phase 1.5:
Day 1 of consolidation:
09:00 - Start collecting data from ST environments
11:00 - Discover 15 records missing release tags
11:30 - Track down why they're untagged (manual creation)
13:00 - Manually add tags (guessing which release)
14:00 - Discover 8 records tagged with wrong environment
15:00 - Investigate and correct
16:00 - Find feature ID doesn't exist in tracking system
17:00 - Still debugging...
Result: Day 1 spent fixing tagging issues, not consolidating
With Phase 1.5:
Release branch merged at 14:00
14:05 - Release Tagging Service starts
14:15 - Scan complete, 7 issues found
14:20 - Auto-correction applied to 6 issues
14:25 - 1 issue flagged for manual review
14:30 - QA Lead reviews and approves
14:35 - Manifest generated, data locked
14:40 - Ready for consolidation
Result: 35 minutes, all issues resolved, ready to consolidate
The difference: Proactive validation vs. reactive debugging.
The Transformation Delivered
Let's compare the before and after:
Data Management
Before:
- Spreadsheets with no version control
- Manual data entry with no validation
- No traceability or context
- Conflicts discovered weeks late
After:
- Structured repository with enforced schemas
- Automatic tagging at creation time
- Complete traceability to features and releases
- Conflicts detected immediately
Test Execution
Before:
- Manual clicking through applications
- 40-80 person-hours per regression
- Sequential execution during business hours
- Inconsistent results across testers
After:
- Automated orchestration
- 4-8 hours parallel execution
- Can run overnight, continuously
- Reproducible results
Data Consolidation
Before:
- 3-4 days of manual merging
- Arbitrary conflict resolution
- 10-15% error rate
- Zero audit trail
After:
- 4-8 hours automated merging
- Rule-based resolution (70-80% automated)
- <1% error rate
- Complete audit trail
Test Coverage
Before:
- Declining over time (deletion to avoid conflicts)
- 500 tests → 300 tests over 2 years
After:
- Growing continuously and automatically
- 500 tests → 800+ tests over 2 years
Team Productivity
Before:
- QA spends 40-50% time on data management
- Senior engineers stuck on consolidation
- Testing is bottleneck
After:
- QA spends 10-15% time on data management
- Consolidation automated
- Testing accelerates releases
What's Coming Next
This article introduced the complete system architecture and detailed Phases 1 and 1.5 - the foundation that makes everything else possible.
In Part 3, we'll cover the remaining phases:
Phase 2: Data Consolidation
- Conflict detection algorithms
- Resolution rule engine (how 70-80% auto-resolve)
- Manual review process for complex conflicts
- Validation and integrity checks
- Cross-boundary transfer (cloud to on-prem)
Phase 3: Release Testing
- Two-tier testing strategy (Staging vs. Master)
- Incremental testing approach
- Automated regression on Master Database
- Test result analysis and decision gates
Phase 4: Production Deployment
- Synchronized code and data deployment
- Rollback capability (preserving data consistency)
- Production verification testing
- Monitoring and validation
Phase 5: Continuous Growth
- Nightly cumulative regression
- Coverage expansion without manual effort
- Performance optimization for growing datasets
- Archival and retention strategies
In Part 4, we'll cover:
- Real-world metrics and results
- Implementation strategy (10-17 month timeline)
- Known limitations and trade-offs
- When to use (and when NOT to use) this architecture
- Alternative approaches and patterns
About This Series
This architectural pattern is part of HariOm-Labs, an open-source initiative focused on solving real Cloud DevOps and Platform Engineering challenges with production-grade solutions.
The mission: Share technically rigorous, production-ready implementations with comprehensive documentation of trade-offs, architectural decisions, and real-world considerations. Not toy examples, but battle-tested patterns that teams can actually use.
GitHub: https://github.com/HariOm-Labs
Have you implemented similar orchestration patterns? What challenges did you face? Share your experiences in the comments!