As data platforms evolve from simply “getting jobs to run” to achieving stable and reliable operations, the challenges teams face also begin to shift. Early on, the focus is mainly on whether tasks execute successfully. As scale increases, the concerns move toward access control, clarity of data pipelines, manageability of changes, and the ability to recover from failures.
This is where DataOps starts to show its real value. It is not just a set of tool usage guidelines, but an engineering methodology that spans development, scheduling, and governance. Using WhaleStudio’s development management framework as an example, this article distills a set of practical standards drawn directly from real production experience.
The Three-Layer Development Framework
In complex data platforms, managing everything through a single dimension quickly becomes insufficient as the system grows. WhaleStudio introduces a three-layer structure of Project, Workflow, and Task, which decouples governance, orchestration, and execution, creating clear boundaries for system management.
Project as the Governance Boundary
The project layer is the most fundamental part of the system, yet it is also the most commonly misused. In many teams, projects are treated merely as a way to organize directories. This approach often leads to problems later, such as unclear permissions, resource misuse, and ambiguous ownership.
In a well-designed system, projects should serve as governance boundaries. Everything related to access control should be scoped within a project, including user permissions, data source access, script resources, alerting strategies, and Worker group configurations.
A practical rule is simple: whenever certain users should not be able to view or modify specific resources, enforce isolation at the project level rather than relying on conventions or manual processes.
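The rule above can be sketched as a simple membership check. This is a minimal illustration, not WhaleStudio's actual permission API: the `Project` class and `can_access` function are hypothetical, showing only that both the user and the resource must be scoped to the same project for access to succeed.

```python
from dataclasses import dataclass, field

@dataclass
class Project:
    """Hypothetical project acting as a governance boundary."""
    name: str
    members: set = field(default_factory=set)       # users granted access
    datasources: set = field(default_factory=set)   # data sources scoped to this project

def can_access(user: str, project: Project, datasource: str) -> bool:
    # Access is granted only if the user belongs to the project
    # AND the data source is registered inside that same project.
    return user in project.members and datasource in project.datasources

# Example: the risk-control project isolates its data source
risk = Project("risk_control", members={"alice"}, datasources={"risk_db"})
print(can_access("alice", risk, "risk_db"))  # True
print(can_access("bob", risk, "risk_db"))    # False: not a project member
```

The point is that the check is enforced by the system, not by convention: a user outside the project boundary simply cannot reach the resource.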
Workflow as the Business Pipeline
If projects define who can do what, workflows define how work is organized.
A workflow is essentially a DAG that represents dependencies between tasks. In a typical data pipeline, workflows connect data ingestion, SQL processing, script execution, and sub-process calls into a complete business flow.
Beyond orchestration, workflows also handle scheduling concerns such as dependency management, parallel and sequential execution strategies, retry mechanisms, and backfill logic. This means a workflow is not just a representation of execution logic, but also a key part of system stability design.
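The mechanics described above can be sketched in a few lines using Python's standard-library topological sorter. This is an illustrative simplification, not a scheduler implementation: task names, the `run` callback, and the retry budget are all assumptions made for the example.

```python
from graphlib import TopologicalSorter

def run_dag(tasks: dict, run, max_retries: int = 2):
    """Execute tasks in dependency order; retry each task up to max_retries.

    tasks maps task name -> set of upstream dependencies;
    run(name) raises an exception on failure.
    """
    order = list(TopologicalSorter(tasks).static_order())
    for name in order:
        for attempt in range(max_retries + 1):
            try:
                run(name)
                break
            except Exception:
                if attempt == max_retries:
                    raise RuntimeError(f"task {name} failed after retries")
    return order

# ingestion -> sql_processing -> export; the middle task fails once, then succeeds
calls = {"sql_processing": 0}
def run(name):
    if name == "sql_processing":
        calls[name] += 1
        if calls[name] == 1:
            raise IOError("transient failure")

order = run_dag({"ingestion": set(),
                 "sql_processing": {"ingestion"},
                 "export": {"sql_processing"}}, run)
print(order)  # ['ingestion', 'sql_processing', 'export']
```

Even in this toy form, the two stability concerns are visible: dependency order is derived from the DAG rather than hand-maintained, and a transient failure is absorbed by the retry loop instead of breaking the pipeline.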
In practice, workflows should be treated as traceable and replayable pipelines rather than just collections of tasks.
Task as the Smallest Execution Unit
Under workflows, tasks represent the smallest unit of execution and have the most direct impact on system stability.
Common task types include SQL, Shell, Python, and data integration jobs. Despite their differences, they should follow consistent design principles such as traceability, retry capability, and recoverability.
In many production scenarios, issues do not originate from the scheduler itself, but from the tasks. For example, non-idempotent SQL logic, scripts without proper error handling, or strong dependencies on external systems can amplify risks during retries or backfills. Establishing standards at the task level is therefore critical to overall system reliability.
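A common way to make a daily SQL task safe under retries and backfills is to overwrite exactly one partition per run, so executing the same date twice leaves the same state. The sketch below is illustrative; the table and column names are invented, and the `INSERT OVERWRITE ... PARTITION` syntax assumes a Hive-style engine.

```python
def build_backfill_sql(table: str, dt: str) -> str:
    """Idempotent daily load: overwrite exactly one partition so that
    re-running the task for the same date never duplicates rows."""
    return (
        f"INSERT OVERWRITE TABLE {table} PARTITION (dt = '{dt}') "
        f"SELECT order_id, amount FROM ods_orders WHERE dt = '{dt}'"
    )

# Running the same date twice produces the same statement, hence the same state
sql = build_backfill_sql("dwd_orders", "2024-06-01")
print(sql)
```

Contrast this with a plain `INSERT INTO`, which would duplicate the day's rows on every retry, exactly the kind of risk amplification described above.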
Once the responsibilities of the three layers are clearly defined, the next step is to manage permissions and design workflows effectively to prevent the system from becoming unmanageable as it scales.
Principles for Data Access and Workflow Design
As teams grow and business logic becomes more complex, access control and workflow design become key factors affecting both efficiency and stability. Without consistent standards, systems can quickly become chaotic.
Organize Projects by Business Domain
Projects should primarily be structured around business domains such as sales, risk control, or finance. This aligns naturally with organizational structure and helps clarify ownership.
When cross-team collaboration is required, resource sharing should be implemented through authorization mechanisms rather than placing everything into a single project. While the latter may seem convenient initially, it often leads to uncontrolled permissions over time.
Separate Responsibilities in Permission Design
Permissions should never default to giving everyone full access. Roles such as development, testing, operations, and auditing should be clearly separated, each with its own scope of authority.
This approach reduces the risk of accidental changes and helps standardize release processes, making system changes more controlled.
Balance Isolation and Reuse
Resource management must balance isolation with reuse. Data sources, scripts, resource pools, and Worker groups should be isolated by default to avoid unintended interference.
When reuse is necessary, it should be achieved through controlled authorization rather than duplicating configurations. This reduces maintenance overhead and avoids inconsistencies.
Resolve Permission Differences Through Projects
Whenever permission differences exist, they must be handled through project-level isolation. For example, if certain datasets should only be accessible to specific users, this must be enforced through system mechanisms rather than informal agreements.
Although this principle seems straightforward, it is often overlooked, leading to loss of control over the permission system.
Once the permission model is stable, workflow design becomes the key factor in maintainability.
Control Workflow Size
As the number of tasks grows, placing everything into a single workflow leads to rapidly increasing maintenance costs and higher risk during changes.
In practice, workflows should be split based on data layers or business domains, such as ODS, DWD, DWS, and ADS. The number of nodes within a workflow should remain within a manageable range to avoid excessive complexity.
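One lightweight way to apply this rule is to derive workflow membership from a task's layer prefix and flag workflows that exceed a node budget. The 50-node cap and the prefix convention below are illustrative assumptions, not WhaleStudio defaults.

```python
MAX_NODES = 50  # illustrative cap per workflow, tune to your team

def split_by_layer(tasks):
    """Group tasks into workflows by their data-layer prefix (ods/dwd/dws/ads)."""
    workflows = {}
    for t in tasks:
        layer = t.split("_", 1)[0]
        workflows.setdefault(layer, []).append(t)
    return workflows

def oversized(workflows):
    """Return workflows whose node count exceeds the manageable range."""
    return [name for name, nodes in workflows.items() if len(nodes) > MAX_NODES]

wf = split_by_layer(["ods_orders", "ods_users", "dwd_orders",
                     "dws_sales_daily", "ads_report"])
print(sorted(wf))  # ['ads', 'dwd', 'dws', 'ods']
print(oversized(wf))  # []
```

A check like this can run in CI so that a workflow drifting past the budget is caught at review time rather than discovered during an incident.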
Upgrade Governance When Complexity Increases
When the number of workflows grows too large or directory structures become unmanageable, relying on labels or folders is no longer sufficient. At this point, governance should be elevated to a higher level, such as introducing additional project segmentation.
This is not merely structural optimization, but an evolution of governance strategy.
Once design principles are clear, implementation should align with team size. There is no single solution that fits all teams.
Implementation Strategies for Different Team Sizes
DataOps does not have a universal solution. The right approach depends on team size and system complexity.
Large Teams with Layered Isolation
In large or complex data warehouse environments, multiple business domains, permission boundaries, and data pipelines coexist. In such cases, data warehouse layers such as ODS, DWD, DWS, and ADS should be mapped to different projects and workflows.
Dependencies across projects and workflows must be clearly defined. Impact analysis tools should be used for global governance to ensure changes do not introduce cascading failures.
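The core of such impact analysis is a reachability query over the cross-project dependency graph: given a changed task, list everything downstream of it. A minimal breadth-first sketch, with invented task names, looks like this:

```python
from collections import defaultdict, deque

def downstream_impact(edges, changed):
    """Return every task reachable downstream of `changed`.

    edges: (upstream, downstream) pairs, possibly spanning projects/workflows.
    """
    children = defaultdict(list)
    for up, down in edges:
        children[up].append(down)
    seen, queue = set(), deque([changed])
    while queue:
        node = queue.popleft()
        for nxt in children[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

edges = [("ods.orders", "dwd.orders"),
         ("dwd.orders", "dws.sales_daily"),
         ("dws.sales_daily", "ads.report")]
print(downstream_impact(edges, "ods.orders"))
```

Running this before a change to `ods.orders` immediately surfaces the full blast radius, which is exactly the cascading-failure check the text calls for.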
Medium Sized Teams with Balanced Design
For medium-sized teams, the goal is to maintain stability while avoiding unnecessary complexity.
Projects should not be overly fragmented, and workflows should not be split excessively. Instead, different scheduling cycles such as daily and monthly jobs can be connected through well-defined dependencies.
The focus at this stage should be on unified scheduling strategies and resource pool management rather than introducing overly complex governance frameworks.
Small Teams with Fast Execution
For small teams or early-stage projects, the priority is to establish a working delivery pipeline.
A single workflow can be used to handle core business processes, supported by naming conventions, alerting mechanisms, and backfill strategies to ensure baseline quality. As complexity increases, the system can gradually evolve toward more fine-grained structures.
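Even a small team can enforce its naming convention mechanically. The sketch below assumes a hypothetical `<layer>_<domain>_<entity>` convention; violations are returned so an alert can be raised before the task ever ships.

```python
import re

# Illustrative convention: <layer>_<domain>_<entity>[_<granularity>]
NAME_RE = re.compile(r"^(ods|dwd|dws|ads)_[a-z0-9]+(_[a-z0-9]+)+$")

def check_names(task_names):
    """Return task names that violate the convention, for alerting."""
    return [n for n in task_names if not NAME_RE.match(n)]

bad = check_names(["dwd_sales_orders", "tmp_fix_v2", "ads_finance_report_daily"])
print(bad)  # ['tmp_fix_v2']
```

A check like this costs a few minutes to set up, yet it is precisely the kind of baseline quality gate that lets a single-workflow setup evolve cleanly later.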
This approach keeps costs under control while avoiding overly heavy design in the early stages.
Conclusion
From Project to Workflow to Task, WhaleStudio’s three-layer model provides a clear division of responsibilities. Projects define governance boundaries, workflows manage business orchestration, and tasks handle execution.
With well-designed permission models and properly structured workflows, systems can remain stable and controllable even as complexity grows.
The essence of DataOps lies not in the tools themselves, but in building an engineering system that can evolve sustainably. Only when permissions, resources, and execution logic are governed under a unified framework can a data platform truly support long-term business growth.
Previous Articles
- (5) When Your Data Warehouse Breaks Down, It’s Probably a Naming Problem
- (4) Why Your ADS Layer Always Goes Wild and How a Strong DWS Layer Fixes It
- (3) Key Design Principles for ODS/Detail Layer Implementation: Building the Data Ingestion Layer as a “Stable and Operable” Infrastructure
- (1) A Complete Guide to Building and Standardizing a Modern Lakehouse Architecture: An Overview of Data Warehouses and Data Lakes
Coming Next
Part 7: Scheduling Design Best Practices
