<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Lulu Cheng</title>
    <description>The latest articles on DEV Community by Lulu Cheng (@l1990790120).</description>
    <link>https://dev.to/l1990790120</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1780496%2Faf9506fe-cf53-46b0-ac92-c9bbd2a9353b.jpg</url>
      <title>DEV Community: Lulu Cheng</title>
      <link>https://dev.to/l1990790120</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/l1990790120"/>
    <language>en</language>
    <item>
      <title>One Off to One Data Platform: Design with Intent [Part 2]</title>
      <dc:creator>Lulu Cheng</dc:creator>
      <pubDate>Sun, 08 Dec 2024 00:47:54 +0000</pubDate>
      <link>https://dev.to/jarrid/one-off-to-one-data-platform-design-with-intent-part-2-hp8</link>
      <guid>https://dev.to/jarrid/one-off-to-one-data-platform-design-with-intent-part-2-hp8</guid>
      <description>&lt;p&gt;Many data platforms today are built bottom-up, starting with collecting data that "might be useful later" and gradually patching together solutions as needs arise. This approach inevitably leads to fragmented implementation,  mounting cost and technical debt. Data system design requires specialized expertise across data modeling, distributed systems, security, and compliance. Most companies simply cannot afford dedicated data infrastructure teams in their early days and have to build and adapt their data systems as they grow.&lt;/p&gt;

&lt;p&gt;However, the path to evolving existing systems can be quite challenging. Teams often have to choose between lengthy migrations while maintaining multiple duplicate systems, or a costly full system cutover. Netscape's 1997 decision to rewrite their browser engine cost them their browser business: they couldn't compete with Internet Explorer's rapid feature releases, and their market share declined until Internet Explorer dominated. Many companies start with custom solutions and migrate to vendor platforms as they grow; however, even at the scale where companies can afford vendor platforms, those platforms may still not fit their use cases, and internal users must adapt to new workflows. Many companies end up building custom solutions on top of vendor platforms as they continue to scale. Internal infrastructure teams now have to maintain their original systems, operate vendor platforms, and support custom implementations on top of those platforms - while also training users across different tools and handling security and integrations across multiple systems. Due to the lack of planning and the organic accretion as the business scales, what started as a cheaper solution becomes significantly more costly and complex to operate.&lt;/p&gt;

&lt;p&gt;Designing data platforms that can scale with business growth is more achievable today than before. Over the past decade, many organizations have established clear data usage patterns - product teams need user behavior data, marketing teams track campaign performance, finance teams monitor revenue metrics, and security teams analyze threat patterns. These common use cases are well-established in terms of what data they need and how quickly they need it. Rather than discovering requirements through costly migrations and retrofitting vendor solutions, it's possible to build a data platform that can sustainably scale in terms of cost and operational efficiency.&lt;/p&gt;

&lt;h2&gt;
  
  
  Designing Data Platforms
&lt;/h2&gt;

&lt;p&gt;At its core, a data platform can be defined by two fundamental components: what data you need (data models) and how quickly you need it (latency requirements). Even with a loosely defined use case, understanding these two components allows us to systematically derive data collection mechanisms and infrastructure needs.&lt;/p&gt;

&lt;p&gt;Take fraud risk detection for example. Typically, fraud risk requires three data components: identity, transaction and case management.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9vv6wipxasg9hdzk6tqs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9vv6wipxasg9hdzk6tqs.png" alt="Fraud Risk Use Case" width="800" height="524"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each data component can be mapped to infrastructure based on its latency needs. Identity and transaction verification require stream processing for real-time fraud detection, databases handle ongoing monitoring and alerts, and data lakes support longer-running tasks like pattern analysis and model training.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0lg2pfpe8mqjq33f18qi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0lg2pfpe8mqjq33f18qi.png" alt="Framework" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Models
&lt;/h3&gt;

&lt;p&gt;A data model defines how data should be organized and standardized. It specifies a set of fields and their characteristics - the format, type, and rules for each field. These schemas enable data discovery, while the definitions of individual fields determine governance policies and compliance requirements.&lt;/p&gt;

&lt;p&gt;Well-defined data models enable standardized data collection and processing across the organization. Take user data as an example - marketing needs it for campaign tracking, customer service for case management, product teams for behavior analytics, and risk teams for fraud detection. Without a shared user data model, every team builds their own version of user profiles and tracking logic. Teams end up creating complicated integrations to resolve and reconcile user data between systems. A shared data model serving as a single source of truth simplifies data collection and implementation, while consistent standards make security and compliance much easier to manage.&lt;/p&gt;
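As an illustration, a shared user data model might be expressed as a schema whose fields carry both type and governance information. The field names and attributes here are hypothetical, a minimal sketch of the idea:

```python
from dataclasses import dataclass

@dataclass
class Field:
    name: str
    dtype: str         # e.g. "string", "timestamp"
    pii: bool = False  # drives governance and compliance policies

@dataclass
class DataModel:
    name: str
    fields: list

# A single shared definition of "user" that marketing, product,
# customer service, and risk teams all collect against.
USER_MODEL = DataModel(
    name="user",
    fields=[
        Field("user_id", "string"),
        Field("email", "string", pii=True),
        Field("created_at", "timestamp"),
    ],
)

def pii_fields(model: DataModel) -> list:
    """Fields that require governance controls such as masking or encryption."""
    return [f.name for f in model.fields if f.pii]
```

Because governance attributes live on the shared model, every team that collects user data inherits the same compliance treatment for free.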

&lt;p&gt;Defining comprehensive data models is often difficult for individual teams as they typically focus on their immediate needs. Marketing teams focus on campaign-related fields, while risk teams focus on identity verification attributes. Without a holistic view of how the same data serves different functions, teams often create incomplete or narrowly-focused data models that require further processing and integrations between systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Time Requirements
&lt;/h3&gt;

&lt;p&gt;Time requirements define how quickly data needs to be processed and made available. Processing windows range from real-time (seconds) for immediate decisions, to near real-time (minutes) for monitoring, to batch processing (hours) for analytics and AI/ML applications. These time requirements map to specific infrastructure choices - streaming for real-time, databases for near real-time, and data lakes for batch processing.&lt;/p&gt;

&lt;p&gt;Without a framework, product teams often build redundant infrastructure for similar needs - one team might use Kafka while another uses MSK for streaming, or one team might choose DynamoDB while another uses Cassandra for databases. This creates unnecessary complexity as teams maintain multiple systems with duplicate security controls and integrations.&lt;/p&gt;

&lt;p&gt;By standardizing infrastructure components, product teams no longer need to deploy their own infrastructure, and platform teams can reduce operational overhead by maintaining fewer systems. This standardization also enables better security controls, streamlined integrations, simplified observability, and optimized costs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Generic Data Platform
&lt;/h2&gt;

&lt;p&gt;A data platform architecture framework enables us to systematically derive data collection specifications, infrastructure requirements, security controls, and integration capabilities. This directly addresses the complexity and cost spiral that many organizations face today. Rather than teams building separate systems that platform teams struggle to support, a consistent framework simplifies security, compliance, integration, and cost management across the organization.&lt;/p&gt;

&lt;p&gt;Without consistent implementation, platform teams are constantly forced to choose between maintaining existing systems, performing costly migrations, or building new features. Platform teams end up spending most of their time maintaining disparate systems and handling migrations instead of delivering business-critical capabilities. A framework-driven approach enables organizations to scale their data platforms without disruptive migrations. Smaller organizations can start with the necessary components and expand as they grow, and larger organizations can standardize their existing systems once without constant rewrites.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coming Up Next
&lt;/h2&gt;

&lt;p&gt;In Part 3 of the "One Off to One Data Platform" series, we'll discuss how this framework can be implemented on a practical level. We'll look into how common data platform components such as streaming, databases, data warehouse, and data lake can be assembled to support different business use cases with consistent security and compliance controls. As organizations grow, this modular approach allows teams to scale individual components independently while maintaining standardized interfaces and controls, eliminating the need for constant migrations. With a clear data platform architecture framework, organizations can build data applications that grow with their business rather than limiting it.&lt;/p&gt;

</description>
      <category>data</category>
      <category>architecture</category>
      <category>dataengineering</category>
      <category>startup</category>
    </item>
    <item>
      <title>One Off to One Data Platform: The Unscalable Data Platform [Part 1]</title>
      <dc:creator>Lulu Cheng</dc:creator>
      <pubDate>Thu, 14 Nov 2024 03:23:20 +0000</pubDate>
      <link>https://dev.to/jarrid/one-off-to-one-data-platform-the-unscalable-data-platform-part-1-17ib</link>
      <guid>https://dev.to/jarrid/one-off-to-one-data-platform-the-unscalable-data-platform-part-1-17ib</guid>
      <description>&lt;p&gt;Today, we have access to highly scalable data tools handling massive volumes that would have been unthinkable just a few years ago. LLMs process billions of parameters, streaming platforms handle millions of events per second, and data warehouses that can easily scale to petabytes. Ironically, while individual tools are more scalable than ever, organizations find themselves struggling with increasingly unscalable data platforms.&lt;/p&gt;

&lt;h2&gt;
  
  
  Complexity and Cost Spiral
&lt;/h2&gt;

&lt;p&gt;The data platform landscape has rapidly changed in recent years. Teams went from managing open source tools, to cloud vendors offering wrapped solutions, to enterprise-scale data platforms like Snowflake and Databricks that handle ingestion, storage, processing, and analytics. These enterprise data platforms, while powerful, come with significant operational complexity and cost for organizations.&lt;/p&gt;

&lt;p&gt;Today, the industry's primary response to the growing complexity and cost of data management is data mesh - essentially, giving individual teams the autonomy to build their own data systems and creating additional automation and integrations to connect these separate systems.&lt;/p&gt;

&lt;p&gt;Looking at the general trend, most organizations don’t really struggle to build data systems fast enough; instead, platform teams aren’t able to sustainably support and integrate these separate systems. Worse, inconsistent implementation is a management nightmare for security and compliance. This growing complexity, together with security and compliance requirements, is a major driver of the cost spiral.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs66fym4y4722yd9eophy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs66fym4y4722yd9eophy.png" alt="Image description" width="800" height="434"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Misaligned Teams
&lt;/h2&gt;

&lt;p&gt;Teams working with data are fundamentally misaligned - app developers are responsible for data collection but their primary focus is customer-facing features, data scientists need good data but get burdened with provisioning infrastructure, and platform teams need to optimize costs but can't control how data flows. Today, many vendor solutions try to solve the misalignment problem with data lineage, catalogs, and quality-check automation. However, these tools focus on analyzing the problem, not solving it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvsbnlajbjcmuvrkir8q5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvsbnlajbjcmuvrkir8q5.png" alt="Image description" width="800" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Patch doesn’t Scale
&lt;/h2&gt;

&lt;p&gt;For most large systems, it's almost always easier and cheaper if you know what you're trying to build ahead of time. Data systems are no exception. Yet most teams approach data publishing with the mindset of "we might need this later" instead of "how will this data be used?" This leads to teams spending more time finding and cleaning data than using it, creating unnecessarily complicated transformation processes, while security and compliance become exponentially more costly. The initial cost of building without clear intent might be lower. However, each new feature and maintenance cost compounds significantly over time.&lt;/p&gt;

&lt;p&gt;15 to 20 years ago, when we first had the compute power to process massive amounts of data and businesses were just starting to learn about data's potential, it was understandable that both businesses and engineers didn't yet know how to build data systems. Today, we have many clear patterns and proven use cases across analytics, automation, and operations. Most core business data needs can and should be defined upfront.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coming Up Next
&lt;/h2&gt;

&lt;p&gt;In Part 2 of the "One Off to One Data Platform" series, we'll lay out a design framework for data platforms based on common business use cases. By defining data components and time-based requirements per use case, we can systematically derive what a production system needs - from compliance and security controls to infrastructure components to integration capabilities.&lt;/p&gt;

&lt;p&gt;With a framework, teams can leverage existing platform components for new use cases instead of bootstrapping infrastructure from scratch each time. This approach not only reduces technical complexity, operational burden, and infrastructure costs, but enables organizations to quickly adapt to evolving business needs.&lt;/p&gt;

</description>
      <category>data</category>
      <category>architecture</category>
      <category>devops</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>End the Never-ending Migrations: Platform Adoption Economics Explained</title>
      <dc:creator>Lulu Cheng</dc:creator>
      <pubDate>Mon, 28 Oct 2024 04:37:21 +0000</pubDate>
      <link>https://dev.to/jarrid/end-the-never-ending-migrations-platform-adoption-economics-explained-hd4</link>
      <guid>https://dev.to/jarrid/end-the-never-ending-migrations-platform-adoption-economics-explained-hd4</guid>
      <description>&lt;p&gt;Infrastructure and internal platform adoption don’t typically happen organically. In the best-case scenario, adoption happens through partnerships with product teams to enable feature development or provisioning automation. However, in many cases, particularly in the worst-case scenario, platforms are forcibly “adopted” by product teams through migrations.&lt;/p&gt;

&lt;p&gt;To set the stage: migrations are bad. As a platform team, if you have to resort to migrations, especially ones involving application teams, you have failed. Migrations are not only tedious for both application and platform developers, they provide little to no value-add while posing the risk of major incidents. Moreover, migrations take up resources from both application and platform teams, pulling them away from core responsibilities and impactful work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Platform Adoption Economics
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnjukoxm9xiw7zr3ouo7q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnjukoxm9xiw7zr3ouo7q.png" alt="Image description" width="800" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Platforms are no exception to basic product economics - the easier a platform is to use and get started with, and the more time and work it saves, the more likely developers are to use it. Building a platform that makes developers' lives easier and gains organic adoption is usually the best strategy forward.&lt;/p&gt;

&lt;p&gt;However, the moment platform teams need to use "sticks" instead of "carrots" to drive adoption, the cost of adoption starts to grow. The hidden costs, while hard to track explicitly, usually show up as engineering time spent on non-engineering work, lower team morale, operational overhead, and long development cycles to accommodate forced solutions that don't fit.&lt;/p&gt;

&lt;h3&gt;
  
  
  Happy Path
&lt;/h3&gt;

&lt;p&gt;In the ideal scenario, automated deployment and managed platforms streamline workflows. This is the enablement category: developers save time, focus on the product features that matter to them, and improve developer velocity and overall "happiness".&lt;/p&gt;

&lt;h3&gt;
  
  
  Partnership
&lt;/h3&gt;

&lt;p&gt;In the partnership model, application teams have to invest time working closely with platform engineers to co-create new features tailored to their needs. While this requires more commitment from application teams, it allows platform teams to better understand the use case, which leads to better implementations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Top-Down Adoption
&lt;/h3&gt;

&lt;p&gt;Engineering managers or VPs can mandate the use of specific technologies or platforms, and developers might not necessarily resonate with the decision. In many cases this is a necessary evil for design decisions that are hard to justify with immediate value to individual engineers but have long-term value for the organization as a whole. For example, it's probably easier to decide as an organization whether to use a mono- or multi-repo. However, not every design decision is important enough to be mandated, and mandates run the risk of not being thoroughly validated against every use case, making certain use cases incredibly challenging to develop or maintain.&lt;/p&gt;

&lt;h3&gt;
  
  
  Implicit Features
&lt;/h3&gt;

&lt;p&gt;In some cases, features like tagging or annotation are introduced without much context. Developers may use these features because they were asked to, but they are often unsure of their significance or benefits. While these features might enable certain functionalities, the lack of clarity on intent can lead to inconsistent and confusing implementations over time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Migration and Deprecation
&lt;/h3&gt;

&lt;p&gt;One of the more common—and perhaps the most frustrating—scenarios is platform migration or deprecation. Teams are often asked to move from one platform to another, or to stop using certain functionalities without a replacement or path forward. From the application team’s perspective, there is often little to no value added, while the migration process increases operational burden and poses significant risks. Both platform and application teams find themselves pulled away from their core responsibilities to focus on migration work, which may not align with their role or expertise.&lt;/p&gt;

&lt;p&gt;Migrations are also costly for organizations, yet they are frequently normalized as “growing pains”. While there are often good reasons to migrate - security concerns, compliance requirements, long-overdue (5-10 years of) technical debt - it's one of the most costly ways to gain adoption.&lt;/p&gt;

&lt;h2&gt;
  
  
  Platform Engineering's Existential Crisis
&lt;/h2&gt;

&lt;p&gt;Typical product management focuses on creating features that people love and generate revenue for the company. Internal platforms have more nuanced goals. We are building features developers might or might not love while solving organizational problems like productivity, cost, security, and compliance. Success is measured by the absence of problems - reducing the number of incidents, compliance violations, security breaches, operational costs, and ensuring engineering teams can meet their delivery goals. &lt;/p&gt;

&lt;p&gt;Let's assume most people operate in good faith and want to create great internal platforms that drive organic adoption. While the definition of success can be a little nuanced in platform engineering's case, it's still relatively clear: reduce cost, improve developer velocity, and standardize implementations to reduce complexity around security and compliance. So why do most internal platforms suck, and why are engineering teams - both platform and application - stuck in never-ending migrations?&lt;/p&gt;

&lt;p&gt;Most companies are not specialized in building developer tools or platforms - they have to staff platform teams because they've grown to a size where abstractions and standards become necessary to manage complexity and meet the business's cost, security, and compliance needs. While many platform teams might think they are trying to create the best developer experience possible, the reality is that a platform team's most important job is to translate these complex business requirements into guardrails so that developers can build software safely without fully understanding the underlying complexity.&lt;/p&gt;

&lt;p&gt;Take PCI DSS (Payment Card Industry Data Security Standard) requirements for example. Application developers might have a rough idea that they shouldn't log credit card numbers without redaction or encryption, but most won't understand the exact compliance requirements or company policies. The platform team's job isn't to create the most secure or developer-friendly platform possible, but to provide shared data handling implementations that help application developers build compliant software with reasonable overhead.&lt;/p&gt;
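As a sketch of such a shared implementation, a platform-provided logging helper might redact anything that looks like a card number before it reaches the logs. The regex below is a deliberate simplification - real PAN detection (e.g. with a Luhn check) is stricter:

```python
import re

# Simplified pattern: 13-19 digits, optionally separated by spaces or
# dashes. Not a full PAN detector - just enough to illustrate the idea
# of redacting before the message ever reaches a log sink.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

def redact(message: str) -> str:
    """Replace anything resembling a card number before logging."""
    return CARD_RE.sub("[REDACTED]", message)
```

Application teams call the helper (or get it via a wrapped logger) and stay compliant without knowing the underlying PCI DSS requirements.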

&lt;h2&gt;
  
  
  Case Study: Building a Data Platform
&lt;/h2&gt;

&lt;p&gt;Every business has data. As organizations scale, managing data becomes inevitable. On one hand, data can enable faster and better decision making, reduce operational costs, and drive business growth; on the other, without proper management, it becomes a liability waiting for security breaches and compliance violations to occur. Data management, at its core, is highly technical and requires close coordination between engineering and business teams. Take building business applications for example - when building a hotel booking app, app developers are relatively clear about the features they need to build: search rooms, track reservations, manage availability. However, without coordination between engineering and business teams on data usage, organizations often end up investing enormous engineering resources to retroactively clean, protect, and transform data to derive even basic value from it.&lt;/p&gt;

&lt;p&gt;Take user tracking data for example. Many companies start by collecting user activities in their applications thinking they'll need them for analytics. Without clear alignment between business and engineering on what insights are actually needed, companies end up with inconsistent event naming, missing user context, and PII scattered across raw events. Companies then have to allocate significant resources to cleaning data, detecting PII, stitching together user journeys, and building costly data transformation pipelines - just to get the basic insights that could have been easily obtained if teams had set clear data requirements during the application development phase.&lt;/p&gt;

&lt;p&gt;In response, platform teams often over-engineer solutions: building complex data transformation pipelines, deploying PII detection across multiple systems, and implementing metadata frameworks and data lineage tracking. It's not that the extra work isn't valuable, but the problem could have been solved by a much cheaper and simpler approach: defining clear data requirements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Adoption is a Tool not a Goal
&lt;/h2&gt;

&lt;p&gt;Platform engineering is no different from everyday decision economics: is the benefit worth the adoption cost? Take data documentation for example - while comprehensive documentation is great on paper, how many engineers actually succeed in writing documentation for every system they've created? The reality is, unless the data has strict security or compliance requirements, there's very little value, to either the company or the developers, in documenting that data. A top-down mandate is simply too costly in most cases, and automated metadata tagging, even if inaccurate, could be just good enough.&lt;/p&gt;

&lt;p&gt;Another example is data and cloud infrastructure migrations. One of the most common reasons is to consolidate and standardize deployment to reduce operational burden and maintenance work, strengthen security, and reduce cloud costs. In these cases, the benefit often outweighs the adoption cost of a top-down mandate and large-scale migration - fragmented infrastructure leads to security vulnerabilities, operational inefficiencies, and rising costs that affect the entire organization. The cost of maintaining different deployment processes, security practices, and infrastructure patterns often justifies a multi-year migration project.&lt;/p&gt;

&lt;p&gt;However, there's a fine line between necessary standardization and unnecessary migration. Almost every organization goes through cycles of tooling migrations - switching between different infrastructure management patterns. Platform teams swing between having teams write their own Terraform, using standardized modules, using configuration manifests only, or building fully managed platforms where teams write no infrastructure code at all. Each migration consumes enormous engineering resources from both platform and application teams while the actual value can be marginal.&lt;/p&gt;

&lt;p&gt;These cycles happen because platform teams either try to support every possible use case or force everyone into a single pattern. While building massive platforms to handle every edge case can indeed gain greater adoption, spending 50% of the platform team's capacity supporting the 1% use case most likely doesn't make business sense either.&lt;/p&gt;

&lt;p&gt;Most companies aren't in the business of building developer platforms - they create platforms to enable their core business functions efficiently and securely. Platform teams should be conscious of their service boundary: if a system is too specialized for the platform to support, it makes more sense for product teams to build a custom solution. If the right level of abstraction is yet to be proven, it could make more sense to automate as much as possible and minimize adoption costs by pushing for organic adoption. Alternatively, when standardization truly adds value - like reducing security risks or operational costs - the business value could outweigh the cost of year-long migrations.&lt;/p&gt;

&lt;p&gt;Finding the right balance between platform abstraction and business value will ultimately determine whether your platform enables or hinders your organization's success.&lt;/p&gt;

</description>
      <category>platform</category>
      <category>product</category>
      <category>migration</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Why Data Security is Broken and How to Fix it?</title>
      <dc:creator>Lulu Cheng</dc:creator>
      <pubDate>Tue, 15 Oct 2024 17:43:45 +0000</pubDate>
      <link>https://dev.to/jarrid/why-data-security-is-broken-and-how-to-fix-it-35j2</link>
      <guid>https://dev.to/jarrid/why-data-security-is-broken-and-how-to-fix-it-35j2</guid>
      <description>&lt;p&gt;Despite more and more money spent on data security, data breaches continue to occur. Security, platform, and data engineering teams often operate with differing priorities, leading to fragmented solutions that address symptoms rather than the root causes. While there are many exciting new products in developer tools in cloud and app security with varying levels of success, there are currently no similar tools to empower engineers specifically in data security. &lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons Security Teams Can Learn from Platform Engineering
&lt;/h2&gt;

&lt;p&gt;Security teams often face similar challenges to platform engineering. Both frequently propose changes that, while critical for long-term success, don't offer immediate value to developers and, worse, lead to lengthy migrations. Platform engineering work like infrastructure consolidation or standardization, while helping developers move faster in large organizations, often lacks obvious benefits for individual teams.&lt;/p&gt;

&lt;p&gt;To drive adoption of "unpopular" changes, platform teams commonly use "carrot and stick" approaches. In my career, I've learned that rolling out platform changes alongside features that make developers' lives easier is far more effective than mandates. For example, instead of repeatedly asking (or spamming) teams to annotate their applications with infrastructure tags, automatically creating integrations - such as permission policies granting access to secrets based on the annotations - can improve both adoption and the accuracy of the annotations far more effectively.&lt;/p&gt;
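As a sketch of this "integration as carrot" idea, a deploy pipeline might generate a secret-access policy directly from an application's annotations. The annotation keys and policy shape here are hypothetical, not any specific cloud's API:

```python
def policy_from_annotations(app_name: str, annotations: dict) -> dict:
    """Derive a secret-access policy from deployment annotations.

    Because the policy is generated from the annotations, teams that
    annotate correctly get working secret access for free - which is
    what keeps the annotations accurate over time.
    """
    team = annotations.get("team", "unassigned")
    env = annotations.get("environment", "dev")
    return {
        "principal": f"app/{app_name}",
        "allow": [f"secrets/{team}/{env}/*"],
    }
```

A team that mislabels its `team` annotation simply can't read its own secrets, so the incentive to fix metadata is built into the workflow rather than enforced by reminders.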

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frw0zhbebbgvmsad3pjai.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frw0zhbebbgvmsad3pjai.png" alt="Image description" width="800" height="544"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Security engineering shares many similarities with platform engineering. Platform engineering teams are accustomed to building self-serve developer tools and automation to reduce their support and ops burden. Today, many security teams focus on creating scans, dashboards, scores, and pinging teams to address vulnerabilities—an approach proven ineffective and overwhelming for application teams. Instead, identifying creative solutions that simplify developers' work while strengthening security implementations could drive much better adoption.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fragmented Data Security
&lt;/h2&gt;

&lt;p&gt;Despite continued increases in cybersecurity insurance premiums and spending on security solutions, data breaches remain problematic and continue to grow in both number and scale. Data security requires a combination of expertise from platform engineering, data engineering, and security. The reality is that most organizations can't afford in-house expertise across all three disciplines, and even when they can, allocating resources to develop automated data security solutions is often not a business priority.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw6abiz7lkck7prlc43y8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw6abiz7lkck7prlc43y8.png" alt="Image description" width="800" height="648"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Today, data security implementation is shared across these three teams, often resulting in fragmented and ineffective solutions. Security teams define requirements for platform teams. Platform teams then interpret these requirements (often as simple checkboxes) for data teams. Data developers face inconsistent expectations; some resort to creating their own solutions, which aren't scalable due to lack of coordination and alignment across teams.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Data Security Can Learn from App Security
&lt;/h2&gt;

&lt;p&gt;Scanning and remediation aren't new concepts in software security. Take app security for example: tools like SonarQube and CVE scans have existed for more than a decade (yes, we didn't need AI for that). Companies like &lt;a href="https://snyk.io/" rel="noopener noreferrer"&gt;Snyk&lt;/a&gt;, &lt;a href="https://arcjet.com/" rel="noopener noreferrer"&gt;Arcjet&lt;/a&gt;, and &lt;a href="https://github.com/dependabot" rel="noopener noreferrer"&gt;GitHub's Dependabot&lt;/a&gt; have taken a more developer-centric approach, creating easy-to-use tools that integrate security into the development process. This makes security a natural part of coding.&lt;/p&gt;

&lt;p&gt;Data security tools like &lt;a href="https://docs.aws.amazon.com/macie/latest/user/what-is-macie.html" rel="noopener noreferrer"&gt;Amazon Macie&lt;/a&gt;, &lt;a href="https://cloud.google.com/security/products/dlp" rel="noopener noreferrer"&gt;Google Cloud DLP&lt;/a&gt;, and platforms like &lt;a href="https://www.wiz.io/" rel="noopener noreferrer"&gt;Wiz&lt;/a&gt;, &lt;a href="https://bigid.com/" rel="noopener noreferrer"&gt;BigID&lt;/a&gt;, &lt;a href="https://www.varonis.com/" rel="noopener noreferrer"&gt;Varonis&lt;/a&gt;, and &lt;a href="https://www.imperva.com/" rel="noopener noreferrer"&gt;Imperva&lt;/a&gt; offer scanning capabilities. However, they often lack automated or retroactive remediation, with existing options typically tied to the platform and inaccessible to most developers.&lt;/p&gt;

&lt;p&gt;Even when we know there's an issue and how to fix it, if the work isn't automated, vulnerability remediation hardly makes it onto the prioritization board alongside feature work, tech debt, and KLOs (keep-the-lights-on work). This narrative can change if we give platform engineers the tools and automation they need to address security issues efficiently and proactively.&lt;/p&gt;

&lt;h2&gt;
  
  
  Developer Centric Data Security Tools
&lt;/h2&gt;

&lt;p&gt;To effectively address data security issues, developers need tools that not only identify problems but also empower developers to solve them through automation. This allows developers to take a more proactive approach in their development cycle.&lt;/p&gt;

&lt;p&gt;Today, many data security platforms run scans against data lakes and infrastructure and flag unencrypted sensitive data stored in an S3 bucket. A security analyst then tags the platform engineer to delete the file. But the flag keeps coming back, because existing data pipelines continue to write sensitive information to the bucket; the real fix is for the data engineering team to properly implement encryption. This often becomes a game of “whack-a-mole” fixes while the root cause remains unaddressed.&lt;/p&gt;

&lt;p&gt;We've created &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; to take a new approach to address the gap:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Security Aware Data Metadata&lt;br&gt;
Data schema formats such as &lt;a href="https://avro.apache.org/" rel="noopener noreferrer"&gt;Avro&lt;/a&gt; and &lt;a href="https://json-schema.org/" rel="noopener noreferrer"&gt;JSON Schema&lt;/a&gt; currently lack built-in support for data sensitivity or security-aware metadata. Similarly, common storage formats like Parquet and Iceberg, while efficient for storing large datasets, don’t natively include security-aware metadata. At &lt;a href="https://jarrid.xyz/" rel="noopener noreferrer"&gt;Jarrid&lt;/a&gt;, we are exploring various metadata formats to incorporate data sensitivity and security-aware attributes that can be tracked at the data lake, file, and even field level.&lt;br&gt;
By tagging security-aware metadata directly into the data schema, tracking encryption keys and enforcing security policies across data pipelines and storage becomes easier. Currently, Keyper makes encryption key permissions configuration-driven and trackable, ensuring that encryption keys and their access rules are well-managed, version-controlled, and easy to audit.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Generic Encryption Library&lt;br&gt;
&lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; is a generic object-level and field-level encryption library that can be integrated at various points in the data lifecycle. The library can encrypt and decrypt data across different platforms and use cases such as streaming, batch processing and ci/cd pipelines. With integrations to data platforms such as Kafka, Spark and ci/cd pipelines such &lt;a href="https://github.com/marketplace/actions/keyper-action" rel="noopener noreferrer"&gt;Github Actions&lt;/a&gt;, developers can implement encryption and decryption as part of the development process.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
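&lt;p&gt;To make the two ideas above concrete, here is an illustrative sketch (not Keyper's actual API): an Avro-like schema extended with a hypothetical &lt;code&gt;sensitivity&lt;/code&gt; attribute per field, which drives which fields get encrypted. Base64 encoding stands in for a real KMS-backed cipher.&lt;/p&gt;

```python
# Illustrative only: "sensitivity" is a hypothetical schema attribute,
# and base64 is a placeholder for a real KMS-managed encryption call.
import base64

schema = {
    "name": "user_event",
    "fields": [
        {"name": "user_id", "type": "string", "sensitivity": "high"},
        {"name": "email", "type": "string", "sensitivity": "high"},
        {"name": "page", "type": "string", "sensitivity": "low"},
    ],
}

def encrypt_stub(value: str) -> str:
    # Placeholder: a real implementation would encrypt with a KMS key.
    return base64.b64encode(value.encode()).decode()

def encrypt_sensitive(record: dict, schema: dict) -> dict:
    """Encrypt only the fields the schema marks as high sensitivity."""
    sensitive = {f["name"] for f in schema["fields"] if f["sensitivity"] == "high"}
    return {k: encrypt_stub(v) if k in sensitive else v for k, v in record.items()}

record = {"user_id": "u-123", "email": "a@example.com", "page": "/home"}
out = encrypt_sensitive(record, schema)
print(out["page"])  # low-sensitivity field passes through unchanged
```

&lt;p&gt;Because the sensitivity markings live in the schema itself, every pipeline stage that reads the schema can apply (and audit) the same encryption policy without per-team reimplementation.&lt;/p&gt;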

&lt;h2&gt;
  
  
  Security as Enabler
&lt;/h2&gt;

&lt;p&gt;Enablement is more effective than limitation, and security is no exception. The path from reactive to proactive data security is steep, and without the right developer tools and automation, there’s no end in sight to data breaches. The most effective way to make this transition is by giving developers the tools and automation they need to solve security problems as part of their existing workflows, rather than as an afterthought.&lt;/p&gt;

&lt;p&gt;Most companies today are overwhelmed by the number of alerts, processes, and compliance requirements they must meet, and carving out room for innovation within existing engineering teams can be challenging. At &lt;a href="https://jarrid.xyz/" rel="noopener noreferrer"&gt;Jarrid&lt;/a&gt;, we continue to experiment with data security tools that enable data, platform, and security teams to move faster together while staying secure. By automating remediation and embedding security into the development process, we can simplify compliance and help organizations proactively address data vulnerabilities.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;As we continue to experiment with developer tools that can both mitigate risks and increase developer velocity, we'd love to learn about the data security challenges your developers face and how we can continue to build &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; to help organizations transition to proactive data security.&lt;/p&gt;

</description>
      <category>security</category>
      <category>automation</category>
      <category>devops</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Platform Engineering Abstraction: How to Scale IaC for Enterprise</title>
      <dc:creator>Lulu Cheng</dc:creator>
      <pubDate>Thu, 03 Oct 2024 06:31:19 +0000</pubDate>
      <link>https://dev.to/jarrid/platform-engineering-abstraction-how-to-scale-iac-for-enterprise-4e86</link>
      <guid>https://dev.to/jarrid/platform-engineering-abstraction-how-to-scale-iac-for-enterprise-4e86</guid>
      <description>&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Infrastructure_as_code" rel="noopener noreferrer"&gt;Infrastructure as code (IaC)&lt;/a&gt; has become increasingly challenging as organizations and usage scale. Inconsistent configurations, security gaps, and rising costs are hard to manage without the right level of abstraction. But what is the right level of abstraction?&lt;/p&gt;

&lt;h2&gt;
  
  
  From DevOps to Infrastructure as Code
&lt;/h2&gt;

&lt;p&gt;DevOps started with engineers clicking around the AWS console and writing scripts to automate tasks like creating EC2 instances and setting up VPCs. As usage grew, Infrastructure as Code (IaC) tools such as &lt;a href="https://www.terraform.io/" rel="noopener noreferrer"&gt;Terraform&lt;/a&gt; and &lt;a href="https://aws.amazon.com/cloudformation/" rel="noopener noreferrer"&gt;CloudFormation&lt;/a&gt; emerged. Instead of writing shell scripts, infrastructure engineers can write declarative code to define how cloud resources should be deployed. This is a great step up in abstraction from one-off automation scripts. On one hand, the declarative nature of IaC tools makes them easy enough for most developers to write (think of it as configuration instead of code); on the other, they aren't abstracted enough to give platform engineers the ability to write modularized code that developers across the organization can use to create infrastructure.&lt;/p&gt;

&lt;p&gt;Take resource tagging, for example. For both cost tracking and governance purposes, it's very common to tag cloud resources by application, team, business unit, cost center, and so on. With Terraform, application engineers need to manually specify these tags for each resource. Even if the platform engineering team creates a shared Terraform "tag" module to enforce common fields, application engineers still have to provide the tag values every time. The declarative nature of IaC makes it hard for platform teams to automatically populate these tag values from a central registry of applications and teams, which becomes especially problematic as the number of resources and the size of the organization grow.&lt;/p&gt;
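&lt;p&gt;A minimal sketch of the alternative, assuming a hypothetical central registry: tag values are looked up from the registry and emitted as a tfvars-style JSON document, so application engineers never hand-type them.&lt;/p&gt;

```python
# Hypothetical sketch: auto-populate resource tags from a central
# application registry instead of per-resource manual entry. The
# registry shape and app names are illustrative.
import json

# Stand-in for a central registry (in practice, a service catalog or API).
REGISTRY = {
    "checkout": {"team": "payments", "business_unit": "commerce", "cost_center": "cc-204"},
}

def tags_for(app: str) -> dict:
    """Merge the app name with its registry entry into a tag map."""
    return {"application": app, **REGISTRY[app]}

# Emit a tfvars-style document a pipeline could feed to Terraform.
tfvars = {"default_tags": tags_for("checkout")}
print(json.dumps(tfvars, indent=2))
```

&lt;p&gt;With this shape, renaming a cost center is a one-line registry change rather than a cross-repo migration.&lt;/p&gt;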

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnixjhrhpramp5ihi1d5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnixjhrhpramp5ihi1d5.png" alt="Image description" width="800" height="361"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This led to the next question: how can we create a higher level of abstraction on top of IaC that addresses these enterprise-scale challenges?&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating Platform Abstraction for Scale
&lt;/h2&gt;

&lt;p&gt;Creating the right level of platform abstraction enables large organizations to manage infrastructure at scale more effectively and increases developer velocity. Platform engineering teams often use a mix of strategies, depending on the use cases, resources, and expertise within the organization.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo8dkjm4jjnxjwvxnhpkj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo8dkjm4jjnxjwvxnhpkj.png" alt="Image description" width="800" height="373"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Terraform Modules or Templates
&lt;/h3&gt;

&lt;p&gt;Creating reusable Terraform modules or templates is a common and easy way to build higher-level abstractions. While flexible, this is a lower-level abstraction: making organization-wide changes can be time-consuming because they require review and testing by each individual application team. For organizations lacking the resources and expertise, vendors like &lt;a href="https://www.resourcely.io/" rel="noopener noreferrer"&gt;Resourcely&lt;/a&gt; generate templated Terraform code with enforced best practices and security controls.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Imperative IaC
&lt;/h3&gt;

&lt;p&gt;With declarative IaC, integrating external processes into infrastructure deployment often requires workarounds. Platform teams frequently combine declarative IaC with one-off scripts or API calls in their CI/CD pipelines. Tools like &lt;a href="https://aws.amazon.com/cdk/" rel="noopener noreferrer"&gt;AWS CDK (Cloud Development Kit)&lt;/a&gt;, &lt;a href="https://developer.hashicorp.com/terraform/cdktf" rel="noopener noreferrer"&gt;CDK for Terraform&lt;/a&gt;, and &lt;a href="https://pulumi.com/" rel="noopener noreferrer"&gt;Pulumi&lt;/a&gt; allow teams to generate IaC using imperative programming languages, enabling platform engineers to create infrastructure code alongside dynamic processes. While this provides greater flexibility and abstraction, it poses a higher adoption barrier for application teams less familiar with these "CDK" frameworks.&lt;/p&gt;
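&lt;p&gt;The model these frameworks share can be illustrated with a toy sketch (this mirrors the "synth" idea only; it is not any real CDK or Pulumi API): resources are declared imperatively in a general-purpose language, then synthesized into a declarative template that tooling can diff and apply.&lt;/p&gt;

```python
# Toy illustration of the imperative-IaC "synth" model. Class and
# resource names are hypothetical, not a real framework's API.

class Stack:
    def __init__(self, name: str):
        self.name = name
        self.resources = []

    def add(self, rtype: str, name: str, **props):
        """Register a resource declaration built from imperative code."""
        self.resources.append({"type": rtype, "name": name, "properties": props})

    def synth(self) -> dict:
        """Synthesize the declarative template the deploy tooling consumes."""
        return {"stack": self.name, "resources": self.resources}

stack = Stack("data-platform")
# Imperative logic (loops, conditionals) that is awkward in pure declarative IaC:
for env in ["staging", "prod"]:
    stack.add("bucket", f"events-{env}", versioned=(env == "prod"))

template = stack.synth()
print(len(template["resources"]))  # 2
```

&lt;p&gt;Because the output is still a declarative artifact, the plan/diff/apply workflow and its audit trail are preserved; the flexibility lives entirely in how the artifact is produced.&lt;/p&gt;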

&lt;h3&gt;
  
  
  3. Infrastructure and Platform as a Service
&lt;/h3&gt;

&lt;p&gt;At the highest abstraction level, platforms like &lt;a href="https://databricks.com/" rel="noopener noreferrer"&gt;Databricks&lt;/a&gt;, &lt;a href="https://www.confluent.io/" rel="noopener noreferrer"&gt;Confluent&lt;/a&gt;, and &lt;a href="https://www.elastic.co/" rel="noopener noreferrer"&gt;Elastic&lt;/a&gt; wrap entire infrastructure deployments, exposing only relevant configurations to developers. This allows provisioning of complex, distributed systems with minimal effort. &lt;a href="https://kubernetes.io/" rel="noopener noreferrer"&gt;Kubernetes&lt;/a&gt; similarly abstracts compute resource management, enabling developers to create deployments using high-level configurations.&lt;/p&gt;

&lt;p&gt;While this high level of abstraction allows platform teams to standardize provisioning, governance, and security at scale for supported use cases, it comes at the cost of reduced flexibility and potential vendor lock-in. New or custom use cases can require much longer implementation cycles.&lt;/p&gt;

&lt;h2&gt;
  
  
  Platform Abstraction: Balancing Flexibility and Control
&lt;/h2&gt;

&lt;p&gt;Defining the "right" level of abstraction is never easy. Too little abstraction can lead to frequent manual updates and migrations, inconsistent standards, security and governance implementations. However, too much abstraction can force teams to adapt their use cases to fit the platform's limitations, slowing down developer velocity.&lt;/p&gt;

&lt;p&gt;The ideal abstraction varies case by case. For real-time fraud detection or experimental features, teams often need flexibility. Templated terraform might provide sufficient guardrails while enabling developers to easily create custom deployments.&lt;/p&gt;

&lt;p&gt;In contrast, common use cases like data streaming, involving Kafka, connectors, and multi-platform integrations, typically warrant more abstraction. Platform teams often create end-to-end deployments, exposing only the necessary configs. However, because data pipelines can be nuanced, platform teams often also expose lower-level components, allowing application teams to create custom implementations easily and bring these features back into the platform as they mature.&lt;/p&gt;

&lt;p&gt;For less complicated scenarios like application deployment, most teams only need to specify a Docker image, CPU, and memory requirements. Platform teams can wrap the CI/CD process without exposing lower-level components.&lt;/p&gt;
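&lt;p&gt;A sketch of that wrapping, with an illustrative, loosely Kubernetes-like output shape (the field names and defaults are assumptions, not a specific platform's spec): the developer supplies only an image, CPU, and memory, and the platform layer expands it into a full deployment spec with organization defaults baked in.&lt;/p&gt;

```python
# Hypothetical platform wrapper: three developer-facing inputs are
# expanded into a fuller deployment spec. Shape and defaults are
# illustrative only.

def expand_deployment(app: str, image: str, cpu: str, memory: str) -> dict:
    return {
        "metadata": {"name": app, "labels": {"managed-by": "platform"}},
        "spec": {
            "replicas": 2,  # org-wide platform default, not developer-supplied
            "container": {
                "image": image,
                "resources": {"requests": {"cpu": cpu, "memory": memory}},
            },
        },
    }

spec = expand_deployment("checkout", "registry.example.com/checkout:1.4.2", "500m", "512Mi")
print(spec["spec"]["container"]["image"])
```

&lt;p&gt;Everything below those three inputs, replica counts, labels, rollout strategy, stays under platform control and can be changed centrally without touching application repos.&lt;/p&gt;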

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fynzosy17cfjubyfod9nw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fynzosy17cfjubyfod9nw.png" alt="Image description" width="800" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Knowing the different implementation strategies and levels of abstraction for various scenarios, how can we build platforms that scale?&lt;/p&gt;

&lt;h2&gt;
  
  
  Future of Platform Engineering
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Data and AI Platform as a Service
&lt;/h3&gt;

&lt;p&gt;Kubernetes has simplified infrastructure and application deployment, but data platforms have become increasingly complex. The proliferation of specialized data tools, compounded by AI and ML solutions, has made standardization, integration, security, and governance much harder to manage than infrastructure or application deployment.&lt;/p&gt;

&lt;p&gt;Vendors like &lt;a href="https://www.confluent.io/" rel="noopener noreferrer"&gt;Confluent&lt;/a&gt;, &lt;a href="https://snowflake.com/" rel="noopener noreferrer"&gt;Snowflake&lt;/a&gt;, &lt;a href="https://databricks.com/" rel="noopener noreferrer"&gt;Databricks&lt;/a&gt;, and &lt;a href="https://www.getdbt.com/" rel="noopener noreferrer"&gt;dbt&lt;/a&gt; are improving the developer experience with more automation and integrations, but they often operate independently. This fragmentation makes standardizing multi-directional integrations across identity and access management, data governance, security, and cost control even more challenging. Developing a standardized, secure, and scalable solution for multi-platform environments is now a fast evolving area for platform engineering teams.&lt;/p&gt;

&lt;h3&gt;
  
  
  Higher Abstraction for Standardization, Security and Governance
&lt;/h3&gt;

&lt;p&gt;While application developers can use low-level Terraform to create infrastructure on their own, this approach doesn’t scale at the enterprise level. Standardization, security, governance, and cost efficiency aren't just “nice to have” but "required" for large organizations. Many companies undertake multi-year, large-scale migrations due to a lack of abstraction where it’s needed, while developer velocity slows in areas where abstraction isn’t necessary.&lt;/p&gt;

&lt;p&gt;Developers are looking for a combination of flexibility and simplicity depending on the features they are trying to build. Platform teams must continue to simplify the developer experience while strategically exposing lower-level components where necessary. By exposing the right components, they provide developers with greater flexibility and reduce the time needed to integrate new features back into the platform.&lt;/p&gt;

&lt;h3&gt;
  
  
  Imperative IaC for Platform Teams
&lt;/h3&gt;

&lt;p&gt;To enable platform teams to build with higher abstraction and flexibility, complexity must be offloaded from CI/CD pipelines, which are designed for orchestration rather than abstraction. Currently, platform teams typically use either configuration-driven systems or declarative IaC, but both approaches fall short when it comes to creating sophisticated, dynamic deployments and abstractions.&lt;/p&gt;

&lt;p&gt;The gap in tooling continues to be a challenge, especially for data and AI platform engineering. While there's a proliferation of data product vendors, few focus on simplifying multi-platform deployments. Instead, there's a surge of governance, lineage, and cataloging products to compensate for the lack of integrations. These solutions aim to solve the complexity that arises from using multiple platforms without addressing the core issue of creating streamlined and flexible deployments across data and AI products and platforms.&lt;/p&gt;

&lt;p&gt;While infrastructure development tools and frameworks are a highly competitive space, there's still room for tools that provide the advantages of imperative programming for complex logic while preserving the audit trails and version control of declarative IaC for governance and compliance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;The value of platform engineering is often less tangible than customer-facing product features. While platform improvements may not directly contribute to top-line growth, they are one of the most effective ways to manage cloud costs, often one of the biggest drivers of bottom-line expense. Choosing the right tooling and abstraction levels not only improves developer experience and productivity but also strengthens security, governance, and compliance across the organization.&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>infrastructureascode</category>
      <category>architecture</category>
      <category>devops</category>
    </item>
    <item>
      <title>Streamline Keyper CI/CD Pipeline with Keyper's Github Action</title>
      <dc:creator>Lulu Cheng</dc:creator>
      <pubDate>Sun, 29 Sep 2024 02:48:00 +0000</pubDate>
      <link>https://dev.to/jarrid/streamline-keyper-cicd-pipeline-with-keypers-github-action-hbc</link>
      <guid>https://dev.to/jarrid/streamline-keyper-cicd-pipeline-with-keypers-github-action-hbc</guid>
      <description>&lt;p&gt;We're excited to introduce the new &lt;a href="https://github.com/marketplace/actions/keyper-action" rel="noopener noreferrer"&gt;Keyper Github Action&lt;/a&gt;. You can now automate &lt;code&gt;keyper deploy plan&lt;/code&gt; and &lt;code&gt;keyper deploy apply&lt;/code&gt; commands in your CI/CD pipeline using &lt;a href="https://github.com/features/actions" rel="noopener noreferrer"&gt;Github Action&lt;/a&gt;. &lt;a href="https://github.com/marketplace/actions/keyper-action" rel="noopener noreferrer"&gt;Keyper Github Action&lt;/a&gt; enables you to deploy data security workflows with only configurations, reducing friction between technical and non-technical teams, and ensuring consistent security policy implementation across your organization.&lt;/p&gt;

&lt;p&gt;➡️ &lt;a href="https://github.com/jarrid-xyz/keyper-tutorial/tree/main/6-use-cases/6-4-deploy-keyper-via-github-action" rel="noopener noreferrer"&gt;Check out our tutorial and try it out now&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/YkJXNaPokgs"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Keyper?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper by Jarrid&lt;/a&gt; is a suite of data security tools designed to simplify role, key, and permission policy creation, management, and deployment. &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; is fully integrated with popular cloud KMS services such as AWS KMS and GCP KMS, and is easy to incorporate into any existing tech stack and CI/CD workflows. With &lt;a href="https://github.com/marketplace/actions/keyper-action" rel="noopener noreferrer"&gt;Keyper's Github Action&lt;/a&gt;, data security policy deployment can be fully automated alongside your existing code and infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Automated Data Security with Keyper CI/CD
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.redhat.com/en/topics/devops/what-is-ci-cd" rel="noopener noreferrer"&gt;Continuous Integration and Continuous Deployment (CI/CD)&lt;/a&gt; automates code and infrastructure deployment. By adding Keyper into your CI/CD pipeline, teams can manage data security cloud resources and policies with configurations and no code required.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing the Keyper Github Action
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/marketplace/actions/keyper-action" rel="noopener noreferrer"&gt;Keyper Github Action&lt;/a&gt; automates role, key, and permission policy deployment in your CI/CD pipeline. It runs &lt;a href="https://developer.hashicorp.com/terraform/cli/commands/plan" rel="noopener noreferrer"&gt;&lt;code&gt;terraform plan&lt;/code&gt;&lt;/a&gt; to validate configurations for new data security resources and policies, and automatically creates or updates them using &lt;a href="https://developer.hashicorp.com/terraform/cli/commands/apply" rel="noopener noreferrer"&gt;&lt;code&gt;terraform apply&lt;/code&gt;&lt;/a&gt; after changes or pull requests are merged into the main branch. By integrating security management into the CI/CD process, Keyper ensures that security configurations are consistently enforced across your organization and infrastructure.&lt;/p&gt;

&lt;p&gt;With fully configuration-driven data security management, both technical and non-technical teams can easily create and enforce standardized security policies without additional operational overhead.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Use the Keyper Github Action
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/marketplace/actions/keyper-action" rel="noopener noreferrer"&gt;Keyper GitHub Action&lt;/a&gt; automates roles, keys, and permission policies in your CI/CD pipeline.&lt;/p&gt;

&lt;p&gt;➡️ &lt;a href="https://github.com/jarrid-xyz/keyper-tutorial/tree/main/6-use-cases/6-4-deploy-keyper-via-github-action" rel="noopener noreferrer"&gt;Check out the full tutorial and try it out now&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Setup Keyper Github Action&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Create the Github Action YAML file at &lt;code&gt;.github/workflows/keyper-cicd.yml&lt;/code&gt; in your repository and add Keyper to the steps.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;```yml
name: Keyper Action (Deploy Plan/Apply)

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  keyper-action:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Keyper Action (Deploy Plan)
        id: keyper-plan
        uses: jarrid-xyz/keyper@v0.0.4
        with:
          args: deploy plan
      - name: Run Keyper Action (Deploy Apply)
        id: keyper-apply
        uses: jarrid-xyz/keyper@v0.0.4
        with:
          args: deploy apply
        if: github.ref == 'refs/heads/main' # Only run if merged to main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Create Keyper Resources&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In your repo's directory, create Keyper deployment, role, and key with Keyper CLI.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;keyper create &lt;span class="nt"&gt;-t&lt;/span&gt; deployment &lt;span class="c"&gt;# create deployment&lt;/span&gt;
keyper create &lt;span class="nt"&gt;-t&lt;/span&gt; role &lt;span class="nt"&gt;-n&lt;/span&gt; role-1 &lt;span class="c"&gt;# create role&lt;/span&gt;
keyper create &lt;span class="nt"&gt;-t&lt;/span&gt; key &lt;span class="c"&gt;# create key&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;This will generate configuration files similar to &lt;a href="https://github.com/jarrid-xyz/keyper-tutorial/tree/main/configs/" rel="noopener noreferrer"&gt;the example in the keyper-tutorial&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;configs
└── fb94659d-ce39-45a8-a2d7-112b4104cf43
    ├── deployment.json
    ├── key
    │   └── 8fc8518c-6691-4294-83ed-9dd9e46e5722.json
    └── role
        └── c90177bc-054c-42f4-89a0-3839b1f0b8f8.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Create and Merge the Change&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Commit and push the changes to your remote repository:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add configs
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"&amp;lt;commit message&amp;gt;"&lt;/span&gt;
git push
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Create a PR, and the Keyper Github Action will be triggered automatically to run &lt;code&gt;terraform plan&lt;/code&gt; on the PR and validate the configurations.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Check Deployment Status&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If &lt;code&gt;terraform plan&lt;/code&gt; looks good, merge the PR to main. The Keyper Github Action will be triggered automatically again to run &lt;code&gt;terraform apply&lt;/code&gt;. This will create and deploy the role and key to the cloud.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;At Jarrid, we believe security is an enabler for developers. By creating security-aware developer tools, we empower engineering teams to build applications faster and more securely. &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; automates security processes, allowing developers to focus on product features without compromising security. With a configuration-based interface, both technical and non-technical teams can collaborate to develop organization-wide security standards, minimizing miscommunication and reducing operational overhead.&lt;/p&gt;

&lt;p&gt;With &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt;, security is embedded into the development toolings and process. Engineering teams can build secure software without compromising on simplicity and velocity. By injecting security tools into the development process, organizations can have a flexible yet robust way to evolve their security practices constantly without migrations or disruptions.&lt;/p&gt;

</description>
      <category>kms</category>
      <category>terraform</category>
      <category>githubactions</category>
      <category>security</category>
    </item>
    <item>
      <title>End-to-End AWS KMS Encryption and Decryption Tutorial</title>
      <dc:creator>Lulu Cheng</dc:creator>
      <pubDate>Tue, 10 Sep 2024 04:18:28 +0000</pubDate>
      <link>https://dev.to/jarrid/end-to-end-aws-kms-encryption-and-decryption-tutorial-e32</link>
      <guid>https://dev.to/jarrid/end-to-end-aws-kms-encryption-and-decryption-tutorial-e32</guid>
      <description>&lt;p&gt;We're excited to share our new tutorial on &lt;a href="https://github.com/jarrid-xyz/keyper-tutorial?tab=readme-ov-file#keyper-tutorial" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt;. &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; v0.0.3 now supports AWS (in addition to GCP) for end-to-end data and file encryption and decryption. Whether you're a data engineer, platform engineer, or security analyst, this guide will help you securely manage encryption keys and protect sensitive data in your AWS cloud environment using &lt;a href="https://aws.amazon.com/iam/" rel="noopener noreferrer"&gt;AWS IAM&lt;/a&gt; and &lt;a href="https://aws.amazon.com/kms/" rel="noopener noreferrer"&gt;KMS&lt;/a&gt; in three simple commands.&lt;/p&gt;

&lt;p&gt;➡️ &lt;a href="https://github.com/jarrid-xyz/keyper-tutorial?tab=readme-ov-file#keyper-tutorial" rel="noopener noreferrer"&gt;Go to the Keyper AWS tutorial now&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Use Keyper and AWS KMS for Data Security?
&lt;/h2&gt;

&lt;p&gt;Data security is increasingly important, and encryption is one of the most effective ways to defend against unauthorized access. &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; streamlines &lt;a href="https://aws.amazon.com/iam/" rel="noopener noreferrer"&gt;AWS IAM&lt;/a&gt; role and &lt;a href="https://aws.amazon.com/kms/" rel="noopener noreferrer"&gt;KMS&lt;/a&gt; key management by automating role and key creation and rotation, simplifying permission management, and providing a clear, developer-friendly interface. &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; reduces the complexity of securing sensitive data, enabling engineers to focus on their core tasks while managing encryption and decryption operations with just a few simple commands.&lt;/p&gt;

&lt;h2&gt;
  
  
  What You'll Learn
&lt;/h2&gt;

&lt;p&gt;In this tutorial, you’ll walk through how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set up AWS IAM roles and KMS keys for encryption and decryption using the AWS CLI.&lt;/li&gt;
&lt;li&gt;Manage encryption keys using &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; via &lt;a href="https://www.terraform.io/" rel="noopener noreferrer"&gt;Terraform&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Encrypt a vulnerable file stored in S3 and ensure it’s protected against unauthorized access.&lt;/li&gt;
&lt;/ul&gt;
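
&lt;p&gt;For reference, the Terraform underlying this kind of setup looks roughly like the fragment below. This is an illustrative sketch, not Keyper's actual generated configuration; the resource, alias, and policy names are hypothetical:&lt;/p&gt;

```hcl
# Illustrative only: a KMS key with automatic rotation enabled, an alias,
# and a policy granting encrypt/decrypt on that key. Names are hypothetical.
resource "aws_kms_key" "data_key" {
  description             = "Data encryption key"
  enable_key_rotation     = true
  deletion_window_in_days = 30
}

resource "aws_kms_alias" "data_key" {
  name          = "alias/keyper-data-key"
  target_key_id = aws_kms_key.data_key.key_id
}

resource "aws_iam_policy" "use_data_key" {
  name = "use-data-key"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["kms:Encrypt", "kms:Decrypt"]
      Resource = aws_kms_key.data_key.arn
    }]
  })
}
```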

&lt;p&gt;The guide is designed to be straightforward and practical, helping you implement encryption in your AWS cloud environment with ease, using AWS KMS for enhanced security without added complexity.&lt;/p&gt;

&lt;h2&gt;
  
  
  AWS KMS Encryption: A Critical Part of Data Security
&lt;/h2&gt;

&lt;p&gt;As organizations handle increasingly sensitive data, encryption becomes a key defense mechanism. Traditional access controls can prevent unauthorized users from accessing data, but encryption ensures that even if access controls fail, the data itself remains secure. Read more in &lt;a href="https://jarrid.xyz/articles/2024-08-27-data-security-strategy-beyond-access-control-data-encryption" rel="noopener noreferrer"&gt;Data Security Strategy Beyond Access Control: Data Encryption&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; integrates &lt;a href="https://aws.amazon.com/iam/" rel="noopener noreferrer"&gt;AWS IAM&lt;/a&gt; roles and &lt;a href="https://aws.amazon.com/kms/" rel="noopener noreferrer"&gt;KMS&lt;/a&gt; key creation and management via &lt;a href="https://www.terraform.io/" rel="noopener noreferrer"&gt;Terraform&lt;/a&gt;. It can be easily integrated into existing CI/CD pipelines, data, and tech stacks. This allows you to protect data beyond just at-rest and in-transit encryption, mitigating vulnerabilities and ensuring compliance with data privacy regulations like GDPR and HIPAA using just a few simple commands and configurations.&lt;/p&gt;

&lt;p&gt;➡️ &lt;a href="https://github.com/jarrid-xyz/keyper-tutorial?tab=readme-ov-file#keyper-tutorial" rel="noopener noreferrer"&gt;Go to the AWS KMS encryption tutorial now&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started with Keyper and AWS KMS
&lt;/h2&gt;

&lt;p&gt;➡️ &lt;a href="https://github.com/jarrid-xyz/keyper-tutorial?tab=readme-ov-file#keyper-tutorial" rel="noopener noreferrer"&gt;Get started with Keyper and AWS KMS&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Whether you're responding to potential vulnerabilities or proactively securing your data, &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; allows you to encrypt and decrypt data via &lt;a href="https://aws.amazon.com/kms/" rel="noopener noreferrer"&gt;AWS KMS&lt;/a&gt; with just three simple commands, making it easy to implement effective security practices.&lt;/p&gt;

&lt;p&gt;As always, we’d love to hear your thoughts on the tutorial and how we can make it better. Reach out or join the conversation in our community.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;➡️ &lt;a href="https://tally.so/r/wMLEA8" rel="noopener noreferrer"&gt;Give us feedback on the tutorial&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;➡️ &lt;a href="https://jarrid.xyz/#contact" rel="noopener noreferrer"&gt;Reach out to the Jarrid team&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;➡️ &lt;a href="https://github.com/orgs/jarrid-xyz/discussions" rel="noopener noreferrer"&gt;Ask questions on our discussion board&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>tutorial</category>
      <category>aws</category>
      <category>kms</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Secure Data Stack: Navigating Adoption Challenges of Data Encryption</title>
      <dc:creator>Lulu Cheng</dc:creator>
      <pubDate>Tue, 03 Sep 2024 16:53:48 +0000</pubDate>
      <link>https://dev.to/jarrid/secure-data-stack-navigating-adoption-challenges-of-data-encryption-2djk</link>
      <guid>https://dev.to/jarrid/secure-data-stack-navigating-adoption-challenges-of-data-encryption-2djk</guid>
      <description>&lt;p&gt;Encryption is one of the most effective data security strategies, alongside access control, software updates, and network segmentation. However, adding encryption to an existing tech stack can be challenging. Data pipelines aren’t typically designed to handle encryption and decryption, leading to additional migration work, operational complexity, and higher compute costs.&lt;/p&gt;

&lt;p&gt;Data encryption presents its own challenges. While at-rest and in-transit encryption are common, they often fall short of preventing data breaches within the data store itself. Unlike TLS, where AES and RSA are effective for securing data during transmission, applying these methods to long-term data storage introduces unique challenges.&lt;/p&gt;

&lt;p&gt;Currently, there is no clear strategy for maintaining long-lived encryption keys. Managing keys and performing key rotations can be error-prone and costly. The high computational cost of processing encrypted data further complicates maintaining secure encryption over time. This highlights the need for data-specific security strategies that go beyond just encryption in-transit or at-rest, ensuring robust protection throughout the entire data lifecycle.&lt;/p&gt;

&lt;h2&gt;
  
  
  Challenges to Better Data Security
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Technical Complexity
&lt;/h3&gt;

&lt;p&gt;Data pipelines are often built without the flexibility to handle encryption and decryption processes. For example, a pipeline designed to compute feature values from event data may need to be re-engineered to include encryption and decryption layers, adding cost and complexity to the implementation. Adding data encryption can also require updates to APIs, databases, and storage systems to support encrypted data formats. This involves updating the codebase, rewriting queries to handle encrypted fields, and ensuring that encryption keys are properly managed and accessible to services and compute environments.&lt;/p&gt;

&lt;h3&gt;
  
  
  Performance
&lt;/h3&gt;

&lt;p&gt;Encrypting and decrypting data in real-time introduces higher latency. In high-throughput use cases, as large volumes of data are processed, the extra time needed to encrypt or decrypt each piece can slow down data processing. Encryption and decryption involve mathematical operations that require additional computational resources. When handling large datasets, this can be both costly and time-consuming, potentially preventing organizations from adopting encryption, even though it's a better data security practice.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Management
&lt;/h3&gt;

&lt;p&gt;Currently, there’s no unified key management system specifically for data encryption. Common use cases like TLS are designed for short-lived keys, and long-lived key management lacks standardized solutions. Like any security component, visibility into key management is important: understanding when and how a key was created, when and where it has been accessed, and how long it’s been in use before being rotated. Without this visibility, maintaining secure and effective data encryption practices can be risky and error-prone. The complexity increases when different teams within a company, or across partner organizations, need to securely share or manage keys. Gaps in key management practices can lead to security breaches, misconfigurations, and unauthorized access.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost
&lt;/h3&gt;

&lt;p&gt;Adding encryption to an existing tech stack can involve non-trivial development work, and the additional encryption and decryption processes require more computational resources, which can lead to higher operational and cloud infrastructure costs. Like many security practices, though, the cost of data encryption must be balanced against the potential costs of a data breach: fines for non-compliance with regulations, legal fees, incident response and remediation expenses, and the reputational damage to the brand.&lt;/p&gt;

&lt;h3&gt;
  
  
  Expertise Gaps
&lt;/h3&gt;

&lt;p&gt;Existing data security practices are increasingly insufficient to defend against the growing frequency and scale of data breaches. As the technology evolves, data and software engineers will also need to build the corresponding skills and knowledge to implement data security effectively. This requires a commitment to continuous learning and adaptation from companies and organizations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Security Strategy: A Gradual Approach
&lt;/h2&gt;

&lt;p&gt;Technology is ever-evolving, and so too is software implementation. However, a complete rewrite is not only costly but can also be detrimental to businesses. Instead, a less risky but more common approach is to identify high-impact, low-interruption service areas to roll out the change. By starting small, organizations can try out new technologies and processes, minimizing risks and disruptions to the business while gradually scaling up.&lt;/p&gt;

&lt;h3&gt;
  
  
  High Impact Areas
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Field-Level Encryption
&lt;/h4&gt;

&lt;p&gt;Field-level encryption targets specific sensitive information within a data object. Whole message or full file encryption, by contrast, encrypts entire data objects in one shot, making the entire content inaccessible without the proper decryption key. For real-time or data processing pipelines, field-level encryption allows the system to continue operating, enabling engineering teams to gradually update dependencies or downstream steps. In complex systems, a hard, full cutover can be challenging and risky. Field-level encryption lowers the adoption barrier by avoiding the need for an immediate, full migration, allowing for a more manageable and phased approach.&lt;/p&gt;

&lt;p&gt;One advantage of starting with field-level encryption is that it opens the door to techniques such as data masking, redaction, deterministic encryption, and privacy-preserving encryption. These solutions can be implemented based on the specific security needs and the existing tech stack.&lt;/p&gt;
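
&lt;p&gt;To make the idea concrete, deterministic encryption can be approximated with a keyed hash: the same field value always maps to the same token, so tokenized columns remain joinable without exposing the plaintext. Below is a minimal sketch using only the Python standard library; it illustrates the general technique and is not Keyper's implementation:&lt;/p&gt;

```python
import hashlib
import hmac

def tokenize(secret_key: bytes, value: str) -> str:
    """Deterministically tokenize a field value with HMAC-SHA256.

    Identical (key, value) inputs always yield identical tokens, so
    tokenized columns can still be joined or grouped on. Unlike
    reversible encryption, tokens cannot be decrypted back to plaintext.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# In practice the key would come from a KMS-backed secret store.
key = b"demo-key"
print(tokenize(key, "alice@example.com") == tokenize(key, "alice@example.com"))  # True
```

&lt;p&gt;Because the mapping is deterministic per key, rotating the key changes every token, which is exactly the kind of lifecycle event a key management layer needs to coordinate.&lt;/p&gt;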

&lt;h4&gt;
  
  
  Whole File Encryption in Data Security Vulnerability Remediation
&lt;/h4&gt;

&lt;p&gt;While field-level encryption is useful in more complex real-time systems and data pipelines, whole file encryption can be very effective for data security vulnerability remediation. Encrypting files containing sensitive information makes it much easier for security teams to handle vulnerabilities on their own. Instead of having to consult with engineering or legal teams on whether potentially exposed data can be deleted, security teams can simply encrypt log files or data archives that contain sensitive information. This makes remediation easier and ensures that sensitive information is not only protected, reducing the risk of unauthorized access, but also recoverable when needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Keyper by Jarrid
&lt;/h2&gt;

&lt;p&gt;Whether you are a security team focused on data security vulnerability remediation or a data engineering team implementing field-level encryption, &lt;a href="https://jarrid.xyz/keyper/" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; can help you reduce the complexity of your work. &lt;a href="https://jarrid.xyz/keyper/" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; wraps access control, data encryption key management, and data encryption and decryption into three simple commands, so you can focus on your work while &lt;a href="https://jarrid.xyz/keyper/" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; ensures that data encryption meets the highest security standards.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;While data encryption is effective for protecting sensitive information, it can be costly to implement. Taking a gradual approach, starting with high-impact areas like field-level and whole file encryption, makes adoption more manageable. As technology and security threats evolve, adopting new security strategies is essential. With the right migration strategy, you can keep your tech stack up to date without excessive risks of disrupting your day-to-day business operations.&lt;/p&gt;

</description>
      <category>security</category>
      <category>dataengineering</category>
      <category>encryption</category>
      <category>infosec</category>
    </item>
    <item>
      <title>Data Security Strategy Beyond Access Control: Data Encryption</title>
      <dc:creator>Lulu Cheng</dc:creator>
      <pubDate>Thu, 29 Aug 2024 17:01:37 +0000</pubDate>
      <link>https://dev.to/jarrid/data-security-strategy-beyond-access-control-data-encryption-5913</link>
      <guid>https://dev.to/jarrid/data-security-strategy-beyond-access-control-data-encryption-5913</guid>
      <description>&lt;p&gt;The cost, frequency, and scale of data breaches continue to rise year over year. According to the &lt;a href="https://www.ibm.com/reports/cost-of-a-data-breach" rel="noopener noreferrer"&gt;IBM 2023 Cost of a Data Breach Report&lt;/a&gt;, the average cost reached &lt;strong&gt;$4.45 million&lt;/strong&gt; in 2023. The number of breaches and the amount of exposed records have also increased, with &lt;strong&gt;over 22 billion records&lt;/strong&gt; compromised in 2023 alone, as highlighted in the &lt;a href="https://www.ibm.com/security/data-breach" rel="noopener noreferrer"&gt;IBM Security Data Breach Report&lt;/a&gt;. So, what are companies doing to improve their data security?&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Data Security Vulnerabilities
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Misconfiguration
&lt;/h3&gt;

&lt;p&gt;Cloud storage or databases are left publicly accessible due to simple mistakes. For example, &lt;a href="https://www.theregister.com/2020/08/03/leaky_s3_buckets/" rel="noopener noreferrer"&gt;AWS S3 bucket misconfigurations&lt;/a&gt; have led to numerous data breaches, exposing sensitive information without detection.&lt;/p&gt;

&lt;h3&gt;
  
  
  Weak Authentication
&lt;/h3&gt;

&lt;p&gt;Weak passwords or the lack of multi-factor authentication (MFA) make it easier for attackers to gain access. &lt;a href="https://www.spiceworks.com/it-security/identity-access-management/news/snowflake-implements-mandatory-mfa-following-major-data-breach" rel="noopener noreferrer"&gt;After a recent breach, Snowflake implemented mandatory MFA to enhance security.&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Unencrypted Data
&lt;/h3&gt;

&lt;p&gt;Data that isn’t encrypted during transit or at rest is fully exposed and readable if accessed by unauthorized parties. In the healthcare sector, more than &lt;a href="https://www.hipaajournal.com/healthcare-data-breach-statistics/" rel="noopener noreferrer"&gt;385 million healthcare records&lt;/a&gt; have been exposed in data breaches since 2009, many of which could have been prevented with proper data encryption.&lt;/p&gt;

&lt;h3&gt;
  
  
  Insider Threats
&lt;/h3&gt;

&lt;p&gt;Misuse of privileged access can lead to severe breaches, as seen in the &lt;a href="https://duo.com/decipher/capital-one-breach-highlights-challenges-of-insider-threats" rel="noopener noreferrer"&gt;Capital One breach&lt;/a&gt; in 2019. A former employee exploited a vulnerability in Capital One’s cloud infrastructure, exposing the personal information of over 100 million customers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Unpatched Software
&lt;/h3&gt;

&lt;p&gt;Systems that aren’t regularly updated are vulnerable to attacks that exploit known security flaws. The &lt;a href="https://www.bbc.com/news/technology-39901382" rel="noopener noreferrer"&gt;WannaCry ransomware attack in 2017&lt;/a&gt; is a good example of how regular software updates could greatly reduce the risk of data security incidents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mitigation Strategy for Data Security Vulnerabilities
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Regular Audits and Monitoring
&lt;/h3&gt;

&lt;p&gt;The most foundational step in mitigating data security vulnerabilities is conducting regular security audits and monitoring. Implementing tools to detect unusual activity, misconfigurations, or potential breaches in real-time enables teams to take quick action before issues escalate.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Deletion
&lt;/h3&gt;

&lt;p&gt;Deleting sensitive data is often the quickest way to reduce risk when a vulnerability is detected or when data is no longer needed. &lt;strong&gt;Short-term,&lt;/strong&gt; this approach is highly effective because it immediately removes the data from potential exposure. &lt;strong&gt;Long-term,&lt;/strong&gt; it doesn’t address the underlying vulnerabilities and may only provide a temporary fix. Additionally, this strategy can be constrained by legal, compliance, and technical requirements.&lt;/p&gt;

&lt;h3&gt;
  
  
  Additional Access Control
&lt;/h3&gt;

&lt;p&gt;Strengthening access controls is a common mid-term solution to prevent unauthorized access. This can include implementing policies like no public buckets by default in S3 or mandatory multi-factor authentication (MFA), as seen with Snowflake. &lt;strong&gt;Short-term,&lt;/strong&gt; access control can be highly effective in enhancing security by limiting who can access sensitive data. &lt;strong&gt;Long-term,&lt;/strong&gt; its effectiveness may diminish if not consistently maintained or updated. Inconsistencies in implementation across different platform vendors (like Snowflake, Databricks, and AWS) can create complexity, slow down development velocity, and sometimes reduce the effectiveness of access controls as a security measure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Encryption
&lt;/h3&gt;

&lt;p&gt;Encryption is a comprehensive long-term solution that protects data even if unauthorized access occurs. &lt;strong&gt;Short-term,&lt;/strong&gt; encryption is highly effective because it immediately secures data, ensuring it remains unreadable without the correct keys. &lt;strong&gt;Long-term,&lt;/strong&gt; encryption is the most effective approach because it provides ongoing protection as long as the keys are properly managed. Additionally, integrating encryption into your tech stack opens up the possibility of implementing advanced security features such as tokenization, data masking, and deterministic encryption, which further enhance data protection.&lt;/p&gt;
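
&lt;p&gt;As a small illustration of one such feature, data masking can be as simple as redacting most of a field while keeping enough structure for debugging and support workflows. A minimal Python sketch (illustrative only; not a Keyper API):&lt;/p&gt;

```python
def mask_email(value: str) -> str:
    """Mask the local part of an email, keeping the first character and the domain.

    Masking preserves enough structure to recognize the value in logs or
    support tickets while hiding the sensitive portion.
    """
    local, _, domain = value.partition("@")
    if not domain:
        # Not an email-shaped value; mask everything.
        return "*" * len(value)
    return local[:1] + "*" * max(len(local) - 1, 0) + "@" + domain

print(mask_email("alice@example.com"))  # a****@example.com
```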

&lt;h3&gt;
  
  
  Comparison of These Three Approaches
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Approach&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Short-Term Effectiveness&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Long-Term Effectiveness&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Level of Effort&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Challenges&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Deletion&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High—immediate risk reduction&lt;/td&gt;
&lt;td&gt;Low—only a temporary fix, doesn't address root vulnerabilities&lt;/td&gt;
&lt;td&gt;Minimal&lt;/td&gt;
&lt;td&gt;Legal, compliance, and technical constraints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Additional Access Control&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Moderate to High—effective in preventing unauthorized access&lt;/td&gt;
&lt;td&gt;Medium—can degrade over time if not consistently maintained or updated&lt;/td&gt;
&lt;td&gt;Can slow down development velocity&lt;/td&gt;
&lt;td&gt;Inconsistent implementation across platforms; may require regular updates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Encryption&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High—protects data even if accessed&lt;/td&gt;
&lt;td&gt;High—ensures long-term data security; foundational for advanced features like masking and tokenization&lt;/td&gt;
&lt;td&gt;Significant engineering effort&lt;/td&gt;
&lt;td&gt;Requires re-architecting; technical complexity, but enables advanced security features&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Advantages of Data Encryption
&lt;/h2&gt;

&lt;p&gt;Encryption simplifies system architecture by reducing the need for complex access control, data governance, and extra security features on vendor platforms and developer tools. By protecting data at the core, organizations can spend less time and money on creating and dealing with extra approval processes and more on ensuring sensitive information is always private and secure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Long-Term Simplification and Reliability
&lt;/h3&gt;

&lt;p&gt;Data encryption encourages simpler architecture over time by reducing the extra security features needed across individual vendor platforms, internal processes, access control policies, governance, and monitoring. This not only enhances long-term data security but also makes it easier to keep implementation simple and maintain compliance across an organization’s infrastructure and overall security.&lt;/p&gt;

&lt;h2&gt;
  
  
  Keyper: Jarrid's Data Encryption SDK
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; is designed to make encryption and security management straightforward and accessible across your organization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuration-Driven
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; comes with an easy-to-understand setup, allowing non-engineering functions—such as security, compliance, and legal teams—to collectively decide on access and encryption rules. Configurations can be applied by engineers directly using the SDK. This ensures security decisions are clear and aligned across all teams without additional overhead between engineering and security.&lt;/p&gt;

&lt;h3&gt;
  
  
  Easy Integration with Existing Tech Stacks
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; is designed to be easily integrated with existing CI/CD workflows and tech stacks at storage and consumption points. This allows organizations to enhance their data security without disrupting existing processes or requiring significant changes to their infrastructure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complementary to Existing Security Frameworks
&lt;/h3&gt;

&lt;p&gt;By adding encryption to your existing security framework, &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; improves the overall data security posture of your organization. It ensures that sensitive data remains secure even if other security measures are compromised, offering better protection and strengthening your existing access control and approval processes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Data breaches are becoming more frequent and costly. While data deletion and access control offer short-term protection, they fall short in the long run. Encryption provides lasting security by simplifying system architecture and reducing reliance on complex controls. Jarrid’s &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; makes encryption simple within your existing tech stack, ensuring data remains secure even if other measures fail.&lt;/p&gt;

</description>
      <category>security</category>
      <category>dataengineering</category>
      <category>encryption</category>
      <category>infosec</category>
    </item>
    <item>
      <title>Step by Step Guide to Remediate Data Vulnerability</title>
      <dc:creator>Lulu Cheng</dc:creator>
      <pubDate>Fri, 09 Aug 2024 21:36:14 +0000</pubDate>
      <link>https://dev.to/jarrid/step-by-step-guide-to-remediate-data-vulnerability-4p7o</link>
      <guid>https://dev.to/jarrid/step-by-step-guide-to-remediate-data-vulnerability-4p7o</guid>
      <description>&lt;p&gt;We're excited to share our new &lt;a href="https://github.com/jarrid-xyz/keyper-tutorial?tab=readme-ov-file#keyper-tutorial" rel="noopener noreferrer"&gt;Keyper tutorial&lt;/a&gt; -- a step-by-step guide to remediating data vulnerabilities identified by popular scan tools such as &lt;a href="https://www.wiz.io/" rel="noopener noreferrer"&gt;Wiz&lt;/a&gt; and &lt;a href="https://www.dig.security/" rel="noopener noreferrer"&gt;Dig&lt;/a&gt; with &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you’re a data engineer, platform engineer, or security analyst, this tutorial is for you. Whether you’re working with sensitive data in healthcare, finance, or any other area, knowing how to lock down that data with encryption is critical to addressing data vulnerabilities and keeping your data safe and sound.&lt;/p&gt;

&lt;p&gt;➡️ &lt;a href="https://github.com/jarrid-xyz/keyper-tutorial?tab=readme-ov-file#keyper-tutorial" rel="noopener noreferrer"&gt;Go to whole file encryption tutorial now&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Encryption as a Proactive Defense in Data Security
&lt;/h2&gt;

&lt;p&gt;Sensitive data stored in cloud storage can be vulnerable to unauthorized access, even when traditional security measures are in place. Whole file encryption adds an additional layer of security by ensuring that even if unauthorized access occurs, the data remains unreadable without the decryption key.&lt;/p&gt;

&lt;p&gt;Popular cloud security tools like &lt;a href="https://www.wiz.io/" rel="noopener noreferrer"&gt;Wiz&lt;/a&gt; and &lt;a href="https://www.dig.security/" rel="noopener noreferrer"&gt;Dig&lt;/a&gt; often identify vulnerabilities in files stored in cloud storage. While these tools are effective at detection, they don’t provide the means to automatically secure the data. That’s where &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; comes in, making it easy to encrypt these files and remediate the data vulnerability.&lt;/p&gt;

&lt;p&gt;Encryption helps you to proactively secure your data at all times. By encrypting data, you ensure that even if vulnerabilities are discovered or security measures fail, the data remains secure and inaccessible to unauthorized users. This makes encryption a crucial part of the data security strategy.&lt;/p&gt;

&lt;h2&gt;
  
  
  What You’ll Learn
&lt;/h2&gt;

&lt;p&gt;In the tutorial, you’ll learn how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set up roles and keys for whole file encryption.&lt;/li&gt;
&lt;li&gt;Deploy and manage encryption keys using &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; via &lt;a href="https://www.terraform.io/" rel="noopener noreferrer"&gt;Terraform&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Encrypt a vulnerable file stored in Google Cloud Storage (GCS) and ensure it’s protected against unauthorized access.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The guide is designed to be straightforward and practical, helping you implement encryption in your cloud environment with minimal effort.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;➡️ &lt;a href="https://github.com/jarrid-xyz/keyper-tutorial?tab=readme-ov-file#keyper-tutorial" rel="noopener noreferrer"&gt;Go to whole file encryption tutorial now&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Whether you're addressing vulnerabilities identified by security tools or proactively securing your data, &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; provides a simple and effective solution. The tutorial focuses on using &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt; to simplify as many steps as possible for you, but you can also decompose it and re-implement the &lt;a href="https://www.terraform.io/" rel="noopener noreferrer"&gt;Terraform&lt;/a&gt; deployment on your own without &lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As always, we'd love to hear from you.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;➡️ &lt;a href="https://tally.so/r/wMLEA8" rel="noopener noreferrer"&gt;Let us know if you liked the tutorial and how can we make it better&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;➡️ &lt;a href="https://jarrid.xyz/#contact" rel="noopener noreferrer"&gt;Reach out to the Jarrid&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;➡️ &lt;a href="https://github.com/orgs/jarrid-xyz/discussions" rel="noopener noreferrer"&gt;Ask questions on our discussion board&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>tutorial</category>
      <category>data</category>
      <category>security</category>
      <category>terraform</category>
    </item>
    <item>
      <title>How Data Encryption Can Simplify Infrastructure Architecture</title>
      <dc:creator>Lulu Cheng</dc:creator>
      <pubDate>Wed, 31 Jul 2024 04:59:36 +0000</pubDate>
      <link>https://dev.to/jarrid/how-data-encryption-can-simplify-infrastructure-architecture-49mp</link>
      <guid>https://dev.to/jarrid/how-data-encryption-can-simplify-infrastructure-architecture-49mp</guid>
      <description>&lt;p&gt;Product and infrastructure engineering teams are not always aligned with the interests of security engineering teams. While product and infrastructure focus on driving business value and delivering practical solutions, security focuses on detection, prevention, and remediation, which can seem less immediately valuable. Like an insurance policy, it's not entirely obvious why it's worth the money or effort when there hasn't been an incident yet. Instead of the traditional cycle of identifying vulnerabilities, applying remediation, and following up through case management, I've found it much more effective to advocate for security solutions that also deliver business value. For example, using OAuth and IAM-based access instead of static keys and encryption instead of more granular access control can significantly simplify infrastructure, reduce complexity, and lessen the operational burden, making them very appealing to both product and platform engineering teams.&lt;/p&gt;

&lt;h2&gt;
  
  
  An Example: Replace Static Keys with IAM-Based OAuth
&lt;/h2&gt;

&lt;p&gt;Traditionally, access between systems is implemented via static key-secret pairs. While common, this method often leads to reliability issues due to the complexity of managing key generation, rotation, and application lifecycle. Platform teams must also invest significant effort in monitoring and detecting anomalies to prevent unexpected key-secret compromises, such as accidental exposure via Slack or GitHub. Even when developers report and remediate leaks, the rotation process can be laborious. Worse, developers may consider it a low-risk leak, and the leak can go unreported.&lt;/p&gt;

&lt;p&gt;According to &lt;a href="https://www.iso.org/standard/54534.html" rel="noopener noreferrer"&gt;ISO/IEC 27001:2013, A.9.1&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Organizations must implement policies and procedures to control access to information, ensuring it is only accessible to those with a legitimate need&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Platform teams have two choices: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Add more complex access controls and approval processes.&lt;/li&gt;
&lt;li&gt;Replace static key-secret pairs with IAM-based OAuth.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first option can be tempting, as it involves simply adding a vendor like &lt;a href="https://www.servicenow.com/" rel="noopener noreferrer"&gt;ServiceNow&lt;/a&gt; without much additional work. However, the second option, while requiring more implementation changes, is more secure and reduces the operational burden on application teams to update secrets, restart pods, and ensure secrets are picked up. In fact, several companies focusing on non-human identity authentication, such as &lt;a href="https://www.p0security.com/" rel="noopener noreferrer"&gt;P0&lt;/a&gt; and &lt;a href="https://www.clutch.security/" rel="noopener noreferrer"&gt;Clutch&lt;/a&gt;, have recently emerged, highlighting the growing trend towards more secure and efficient authentication methods.&lt;/p&gt;
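&lt;p&gt;To make the operational difference concrete, here is a minimal sketch of the short-lived-credential pattern that IAM-based OAuth enables. The &lt;code&gt;fetch_token&lt;/code&gt; callable is a hypothetical stand-in for a real token endpoint (an STS AssumeRole call or an OAuth2 client-credentials grant); nothing here is a specific vendor API.&lt;/p&gt;

```python
import time

# Instead of a static key that must be rotated by hand, the client holds a
# short-lived token and refreshes it automatically just before expiry.
class TokenProvider:
    def __init__(self, fetch_token, refresh_margin_s=60):
        self._fetch_token = fetch_token   # returns (token, expires_at_epoch)
        self._margin = refresh_margin_s   # refresh this many seconds early
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # Fetch only when the cached token is absent or about to expire.
        if self._token is None or time.time() >= self._expires_at - self._margin:
            self._token, self._expires_at = self._fetch_token()
        return self._token


# Example with a fake endpoint that issues 15-minute tokens.
calls = []
def fake_endpoint():
    calls.append(1)
    return f"token-{len(calls)}", time.time() + 900

provider = TokenProvider(fake_endpoint)
first = provider.get()
second = provider.get()   # served from cache; no second network call
```

&lt;p&gt;Because tokens expire on their own, a leaked token is only useful for minutes, and there is no rotation runbook: the provider refreshes transparently and the application never restarts to pick up a new secret.&lt;/p&gt;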

&lt;p&gt;This example demonstrates how a different approach to security implementation can improve security standards, simplify infrastructure architecture, and enhance overall developer velocity.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Case for Data Encryption
&lt;/h2&gt;

&lt;p&gt;Data encryption is another example: although security teams cannot simply "add a vendor" (we'd like to be that vendor one day), encryption significantly reduces complexity and implementation effort across all platforms, from both a security and an architecture design standpoint.&lt;/p&gt;

&lt;p&gt;The typical data flow involves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Source application publishes data&lt;/li&gt;
&lt;li&gt;Data is sent to a transport layer (e.g., Kafka, Kinesis)&lt;/li&gt;
&lt;li&gt;Data is stored in a database (MySQL, Postgres), data warehouse (Redshift, Snowflake), or data lake (S3, Databricks)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkln3w5r2xql5ulgu0hv6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkln3w5r2xql5ulgu0hv6.png" alt="Image description" width="800" height="440"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Different solutions have different interpretations and implementations of "access control," leading platform teams to implement their own versions. This often results in fragmented implementations across the company. For security engineers, the more fragmented the implementations are, the more difficult it is to implement standardized governance, control, and monitoring, ultimately making the system less secure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Infrastructure/Vendor Auth and Permission Comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;IAM-Based Auth&lt;/th&gt;
&lt;th&gt;Row/Column Permissions&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Databricks&lt;/td&gt;
&lt;td&gt;Supports IAM-based permissions&lt;br&gt;&lt;a href="https://docs.databricks.com/en/admin/account-settings-e2/credentials.html" rel="noopener noreferrer"&gt;Link&lt;/a&gt;
&lt;/td&gt;
&lt;td&gt;Supports row and column-level security&lt;br&gt;&lt;a href="https://docs.databricks.com/en/tables/row-and-column-filters.html#filter-sensitive-table-data-using-row-filters-and-column-masks" rel="noopener noreferrer"&gt;Link&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Snowflake&lt;/td&gt;
&lt;td&gt;Does not natively support IAM-based permissions&lt;br&gt;&lt;a href="https://docs.snowflake.com/en/user-guide/admin-security-fed-auth-use.html" rel="noopener noreferrer"&gt;Link&lt;/a&gt;
&lt;/td&gt;
&lt;td&gt;Supports &lt;a href="https://docs.snowflake.com/en/user-guide/security-row-intro" rel="noopener noreferrer"&gt;row&lt;/a&gt; and &lt;a href="https://docs.snowflake.com/en/user-guide/security-column-intro" rel="noopener noreferrer"&gt;column-level&lt;/a&gt; security&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MySQL&lt;/td&gt;
&lt;td&gt;Does not support IAM-based permissions&lt;/td&gt;
&lt;td&gt;Supports column-level security through grants; row-level access is typically emulated via views&lt;br&gt;&lt;a href="https://dev.mysql.com/doc/refman/8.0/en/privileges-provided.html" rel="noopener noreferrer"&gt;Link&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PostgreSQL&lt;/td&gt;
&lt;td&gt;Does not support IAM-based permissions&lt;br&gt;&lt;a href="https://www.postgresql.org/docs/current/auth-pg-hba-conf.html" rel="noopener noreferrer"&gt;Link&lt;/a&gt;
&lt;/td&gt;
&lt;td&gt;Supports &lt;a href="https://www.postgresql.org/docs/current/ddl-rowsecurity.html" rel="noopener noreferrer"&gt;row&lt;/a&gt; and &lt;a href="https://www.postgresql.org/docs/current/infoschema-column-privileges.html" rel="noopener noreferrer"&gt;column-level&lt;/a&gt; security&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;With data encryption, access is configured once with a crypto key and can then be assigned to individual workloads at different stages of the data flow. This significantly reduces the complexities involved in implementing and aligning permission policies across different platforms. Encryption ensures that data is consistently protected across all platforms, simplifying governance and control while enhancing overall security.&lt;/p&gt;
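&lt;p&gt;The "configure access once with a crypto key" idea is essentially envelope encryption. Below is a minimal sketch, not Keyper's actual API: the &lt;code&gt;StubKMS&lt;/code&gt; class stands in for a cloud KMS holding the key-encryption key, and Fernet from the widely used &lt;code&gt;cryptography&lt;/code&gt; package stands in for the cipher.&lt;/p&gt;

```python
from cryptography.fernet import Fernet

class StubKMS:
    """Stands in for a cloud KMS (AWS KMS, GCP KMS): wraps and unwraps
    per-record data keys with a single key-encryption key (KEK)."""
    def __init__(self):
        self._kek = Fernet(Fernet.generate_key())   # KMS-held master key

    def wrap(self, data_key: bytes) -> bytes:
        return self._kek.encrypt(data_key)

    def unwrap(self, wrapped: bytes) -> bytes:
        return self._kek.decrypt(wrapped)


def encrypt_record(kms: StubKMS, plaintext: bytes):
    data_key = Fernet.generate_key()                 # fresh DEK per record
    ciphertext = Fernet(data_key).encrypt(plaintext)
    return kms.wrap(data_key), ciphertext            # store both together

def decrypt_record(kms: StubKMS, wrapped_key: bytes, ciphertext: bytes) -> bytes:
    data_key = kms.unwrap(wrapped_key)
    return Fernet(data_key).decrypt(ciphertext)


kms = StubKMS()
wrapped, blob = encrypt_record(kms, b"ssn=123-45-6789")
restored = decrypt_record(kms, wrapped, blob)
```

&lt;p&gt;Access is granted once, to the KEK; every workload at every stage of the data flow then uses the same decrypt path, regardless of whether the ciphertext sits in Kafka, S3, or Snowflake.&lt;/p&gt;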

&lt;p&gt;&lt;a href="https://jarrid.xyz/#keyper" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;Try Keyper Now&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Let Keyper Up Your Data Encryption Game
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://jarrid.xyz/keyper" rel="noopener noreferrer"&gt;Keyper by Jarrid&lt;/a&gt; empowers engineering teams to embed data encryption in any data handling process to streamline security implementation. Keyper offers a suite of crypto key management APIs designed to simplify key creation, management, deployment, and encryption/decryption. By integrating with cloud KMS services like AWS KMS and GCP KMS, Keyper enables managed crypto key generation, reducing infrastructure maintenance. This allows organizations to configure encryption once and apply it consistently across all stages of the data flow, ensuring robust security while simplifying governance and operational processes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://jarrid.xyz/#contact" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;Questions? We are happy to chat&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>data</category>
      <category>security</category>
      <category>architecture</category>
      <category>encryption</category>
    </item>
    <item>
      <title>The Data Security Duo: Data Encryption and Vulnerability Scans</title>
      <dc:creator>Lulu Cheng</dc:creator>
      <pubDate>Sun, 28 Jul 2024 20:12:50 +0000</pubDate>
      <link>https://dev.to/jarrid/the-data-security-duo-data-encryption-and-vulnerability-scans-g9b</link>
      <guid>https://dev.to/jarrid/the-data-security-duo-data-encryption-and-vulnerability-scans-g9b</guid>
      <description>&lt;p&gt;With the rise of tools such as &lt;a href="https://www.wiz.io/" rel="noopener noreferrer"&gt;Wiz&lt;/a&gt; and &lt;a href="https://www.dig.security/" rel="noopener noreferrer"&gt;Dig&lt;/a&gt;, data vulnerability scans have become more accessible than ever, allowing companies to quickly identify and address potential security issues. However, identifying vulnerabilities is just the first step—addressing them effectively requires robust solutions.&lt;/p&gt;

&lt;p&gt;While scans can pinpoint weaknesses, encryption ultimately protects sensitive information. The combination empowers teams to effectively safeguard data, ensuring superior protection and a safer data infrastructure.&lt;/p&gt;

&lt;p&gt;Vulnerability scans have long been a staple in software development. Developers have used tools like &lt;a href="https://www.sonarsource.com/" rel="noopener noreferrer"&gt;Sonar&lt;/a&gt; to ensure their builds don't have dependencies with reported &lt;a href="https://cve.mitre.org/" rel="noopener noreferrer"&gt;CVEs (Common Vulnerabilities and Exposures)&lt;/a&gt;. This process ensures developers update dependencies over time, so their software is not vulnerable to known issues. Over the years, this concept of vulnerability scanning has extended from code to infrastructure and, most recently, to data, and soon, AI/ML.&lt;/p&gt;

&lt;p&gt;Early cloud adoption saw many instances of &lt;a href="https://www.bitdefender.com/blog/businessinsights/worst-amazon-breaches/" rel="noopener noreferrer"&gt;misconfigured S3 buckets accessible to the public&lt;/a&gt;, along with other misconfigurations in databases (&lt;a href="https://techcrunch.com/2019/09/04/facebook-phone-numbers-exposed/" rel="noopener noreferrer"&gt;Facebook database exposed&lt;/a&gt;, &lt;a href="https://www.zdnet.com/article/another-unsecured-elasticsearch-server-exposed-millions-of-records/" rel="noopener noreferrer"&gt;Elasticsearch server exposes personal data&lt;/a&gt;), Kubernetes (&lt;a href="https://www.trendmicro.com/vinfo/us/security/news/cybercrime-and-digital-threats/tesla-and-jenkins-servers-fall-victim-to-cryptominers" rel="noopener noreferrer"&gt;Tesla hack via Kubernetes&lt;/a&gt;), and so on. While &lt;a href="https://en.wikipedia.org/wiki/Infrastructure_as_code" rel="noopener noreferrer"&gt;Infrastructure as Code (IaC)&lt;/a&gt; tools like &lt;a href="https://www.terraform.io/" rel="noopener noreferrer"&gt;Terraform&lt;/a&gt; have greatly simplified infrastructure deployment and management, &lt;strong&gt;ensuring secure configurations remains complex&lt;/strong&gt; for many app developers. This complexity has led to centralized abstractions in many companies, where platform teams handle the intricate security configurations, allowing app developers to focus on building features without worrying about permissions, networking, secret rotations, encryption, and so on.&lt;/p&gt;

&lt;p&gt;However, abstraction is only possible when implementations are homogeneous or generic enough across application or product teams. Data infrastructure is often an exception due to the increasing number of use cases, vendors, implementation patterns, integrations, data sources, etc. AI/ML pipelines, while also complex, vary mainly along training iterations and model and endpoint publishing, a somewhat more streamlined surface than the myriad data sources and processing patterns of general data infrastructure. While platforms like &lt;a href="https://www.wiz.io/" rel="noopener noreferrer"&gt;Wiz&lt;/a&gt; and &lt;a href="https://www.dig.security/" rel="noopener noreferrer"&gt;Dig&lt;/a&gt; enable data infrastructure and platform teams to gain visibility over exposed or vulnerable data storage, without proper tools and abstractions, both app developers and data infrastructure teams remain stuck in a reactive mode, unable to fundamentally address data security issues or identify undesirable implementations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Vulnerability Scans: Knowing the Problem is Only Half the Battle
&lt;/h2&gt;

&lt;p&gt;For package dependencies, addressing vulnerabilities has become more manageable with the increasing adoption of microservices. Fixing individual package dependencies in a microservice is much easier than in a monolithic codebase. Additionally, extensive unit and integration tests also help to reduce the risk associated with dependency upgrades.&lt;/p&gt;

&lt;p&gt;In cloud infrastructure, addressing vulnerabilities is slightly more complex. App developers are often outside their comfort zone when working with infrastructure, but platform engineers are well-versed in this area. Centralized and abstracted cloud infrastructure tech stack and deployments give platform and infrastructure teams significant authority to enforce security best practices.&lt;/p&gt;

&lt;p&gt;When it comes to data, the situation becomes more nuanced. Data usage patterns are highly custom, with diverse requirements ranging from real-time to batch processing and varying performance needs. This diversity makes it difficult for data infrastructure teams to implement generic, yet flexible, security and governance abstractions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Approaches to Data Security Abstraction
&lt;/h2&gt;

&lt;p&gt;When abstractions are challenging, simplifying basic security implementations becomes essential. This is where data encryption shines, providing a robust solution to data security that vulnerability scans alone cannot achieve. Here are the three main approaches to implementing data encryption:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Infrastructure Abstraction&lt;/strong&gt;: Modern data infrastructure typically implements, at a minimum, &lt;a href="https://en.wikipedia.org/wiki/Data_at_rest" rel="noopener noreferrer"&gt;at-rest&lt;/a&gt; and &lt;a href="https://en.wikipedia.org/wiki/Transport_Layer_Security" rel="noopener noreferrer"&gt;in-transit encryption&lt;/a&gt;, along with &lt;a href="https://www.cloudflare.com/learning/access-management/what-is-access-control/" rel="noopener noreferrer"&gt;permissions control&lt;/a&gt;, &lt;a href="https://en.wikipedia.org/wiki/Audit_trail" rel="noopener noreferrer"&gt;audit logs&lt;/a&gt;, and other security measures. However, this is often insufficient for data infrastructure, as data continues to be stored in its raw format in warehouses and data lakes like &lt;a href="https://www.snowflake.com/" rel="noopener noreferrer"&gt;Snowflake&lt;/a&gt; or &lt;a href="https://aws.amazon.com/s3/" rel="noopener noreferrer"&gt;S3&lt;/a&gt;. If any security control fails at the data store level, such as excessive admin access permissions or internal threats, infrastructure security alone cannot address the issue. For example, &lt;a href="https://www.channelinsider.com/news-and-trends/us/new-snowflake-data-breach-exposes-millions-of-customers/" rel="noopener noreferrer"&gt;Snowflake recently had a massive data breach due to lack of MFA&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Sidecar/Proxy Approach&lt;/strong&gt;: Passing sensitive data through a proxy for additional encryption processing is another approach. While this can be effective, deploying sidecars or proxies can be challenging depending on the infrastructure setup. Additionally, data security often needs to be schema-aware, making it difficult for sidecar or proxy layers to handle without additional client-side implementation. Despite these challenges, this approach is framework and client-agnostic, making it easier to implement across diverse data ecosystems. Examples of such offerings include &lt;a href="https://www.conduktor.io/" rel="noopener noreferrer"&gt;Conduktor&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Client-Side Implementation&lt;/strong&gt;: This approach requires more effort for adoption but is the most flexible and simplest in implementation. By equipping developers with the tools they need for data encryption, they can take ownership of security implementation. Providing common tooling allows small platform teams to support a high number of application or product engineering teams. This is similar to giving application developers proper tools such as &lt;a href="https://www.sonarsource.com/" rel="noopener noreferrer"&gt;Sonar&lt;/a&gt; so that they can more effectively address security and vulnerability issues on their own.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
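&lt;p&gt;As an illustration of the client-side approach, here is a hedged sketch of the kind of helper a platform team might ship to application developers: the field names, the record shape, and the &lt;code&gt;SENSITIVE_FIELDS&lt;/code&gt; set are all illustrative, and Fernet from the &lt;code&gt;cryptography&lt;/code&gt; package stands in for whatever primitive the tooling actually provides.&lt;/p&gt;

```python
from cryptography.fernet import Fernet

# Schema awareness lives with the client: the application knows which
# fields are sensitive and encrypts them before a record ever leaves it.
SENSITIVE_FIELDS = {"email", "ssn"}   # illustrative field names

def encrypt_fields(record: dict, key: bytes) -> dict:
    f = Fernet(key)
    return {
        k: f.encrypt(v.encode()).decode() if k in SENSITIVE_FIELDS else v
        for k, v in record.items()
    }

def decrypt_fields(record: dict, key: bytes) -> dict:
    f = Fernet(key)
    return {
        k: f.decrypt(v.encode()).decode() if k in SENSITIVE_FIELDS else v
        for k, v in record.items()
    }

key = Fernet.generate_key()
event = {"user_id": "42", "email": "a@example.com", "ssn": "123-45-6789"}
published = encrypt_fields(event, key)   # safe to send downstream as-is
restored = decrypt_fields(published, key)
```

&lt;p&gt;Non-sensitive fields stay queryable in plaintext, while the sensitive ones remain ciphertext through every transport and storage hop, so a misconfigured bucket or over-broad grant downstream exposes nothing useful.&lt;/p&gt;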

&lt;h3&gt;
  
  
  Data Security Abstraction Comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Pros&lt;/th&gt;
&lt;th&gt;Cons&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Infrastructure Abstraction&lt;/td&gt;
&lt;td&gt;Encryption by default at infrastructure level.&lt;/td&gt;
&lt;td&gt;Comprehensive default encryption, centralized control.&lt;/td&gt;
&lt;td&gt;Raw data may still be vulnerable, not sufficient for all needs.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sidecar/Proxy Approach&lt;/td&gt;
&lt;td&gt;Passing data through a proxy for processing.&lt;/td&gt;
&lt;td&gt;Framework and client-agnostic, easier to implement.&lt;/td&gt;
&lt;td&gt;Challenging deployment, lacks schema awareness.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Client-Side Implementation&lt;/td&gt;
&lt;td&gt;Developers use provided tools to encrypt data on the client side.&lt;/td&gt;
&lt;td&gt;Developers take more ownership with less platform engineering support.&lt;/td&gt;
&lt;td&gt;More upfront work, requires comprehensive toolsets.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Improving Data Security Through Encryption
&lt;/h2&gt;

&lt;p&gt;At &lt;a href="https://jarrid.xyz/" rel="noopener noreferrer"&gt;Jarrid&lt;/a&gt;, we aim to create data security standards that are easy to understand and implement for popular data handling frameworks. We believe creating better security tooling and embedding data security implementation into common languages and frameworks will have a long-lasting impact. For example, most API frameworks now include &lt;a href="https://en.wikipedia.org/wiki/Transport_Layer_Security" rel="noopener noreferrer"&gt;TLS middleware&lt;/a&gt; by default, making adoption much easier for most backend developers.&lt;/p&gt;

&lt;p&gt;Our first step in this direction is &lt;a href="https://jarrid.xyz/keyper/" rel="noopener noreferrer"&gt;Keyper&lt;/a&gt;, a suite of crypto key management APIs designed to simplify key creation, management, deployment, and encryption/decryption. Keyper integrates with cloud KMS services like &lt;a href="https://aws.amazon.com/kms/" rel="noopener noreferrer"&gt;AWS KMS&lt;/a&gt; and &lt;a href="https://cloud.google.com/kms" rel="noopener noreferrer"&gt;GCP KMS&lt;/a&gt;. We believe in making security simple and accessible, allowing developers to focus on building great products without compromising on data security.&lt;/p&gt;

&lt;p&gt;Our roadmap includes adding advanced cryptographic capabilities and making these tools attractive to developers by reducing the friction for adoption. By incorporating the latest features in encryption, we enable companies to adopt the highest standards of data security while continuing to unlock the value in their data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://jarrid.xyz/#keyper" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;Try Keyper Now&lt;/a&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;In summary, while vulnerability scans are necessary, data encryption provides the right tools for infrastructure and data teams to address these vulnerabilities effectively. By making encryption accessible and easy to implement, we can fundamentally solve data security challenges and create a safer, more secure digital landscape.&lt;/p&gt;

&lt;p&gt;Combining data encryption with vulnerability scans offers a comprehensive approach to data security. Scans identify potential weaknesses, while encryption ensures that sensitive information remains protected even if vulnerabilities are exploited. The combination of both approaches empowers teams to proactively safeguard data, enhancing overall security and fostering trust in digital systems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://jarrid.xyz/#contact" class="ltag_cta ltag_cta--branded" rel="noopener noreferrer"&gt;Questions? We are happy to chat&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>security</category>
      <category>data</category>
      <category>encryption</category>
      <category>vulnerabilities</category>
    </item>
  </channel>
</rss>
