Earlier in my career, our entire development team volunteered to help the reporting team validate some major updates to their service. None of us were excited about pausing our current priorities, but we all realized this was the right thing to do.
After I ran my first report, it appeared that the data connected to the application might be a copy of production data. I was able to quickly validate this assumption by running reports based on my own work-related expenses.
This discovery made me realize that a violation of consumer privacy had unexpectedly occurred, potentially paving the way for real data to be leaked outside the organization - now that it existed in an environment where access was more open and not closely monitored.
I knew the reporting team should not have made a copy of production data, but I also recall they were under a great deal of pressure to provide a successful solution. Clearly, the time required to create test data or obfuscate personally identifiable information (PII) was not included in the original project plan.
Years later, I still see issues with several companies and their use of sensitive data as “test data”. This made me wonder how technology giants like Google, Apple, and Netflix handle working with sensitive data.
The Value of a Data Privacy Vault
The bigger issue with the reporting team’s choice was that the sensitive data they used for testing was easy to duplicate onto another data source. While my time cruising the reporting system was eye-opening, there could have been far worse consequences from this lapse in judgment.
Corporations that are required to comply with regulations around personally identifiable information (PII), payment card industry (PCI) data, and healthcare records are already familiar with adhering to guidelines and standards around the isolation, encryption, and utilization of sensitive data. In my experience, these implementations have focused on the database tier.
The concept of a data privacy vault goes beyond a mere database. Data is secured by encrypting the information at rest, and kept usable with encryption technology that allows for certain operations on encrypted data. And all of this is wrapped into a cloud-based service offering.
Data Privacy Vault service providers offer the following benefits:
- Secure - Privacy already handled as part of the service design
- Isolated - Each customer works in their own space, access limited
- Store - Highly-available, mission-critical, with SQL-like access
- Manage - Granular access-control lists and policies
- Use - Data can only be used in a privacy-preserving manner
To understand how this approach works, let’s focus on a common use case.
An Example Use Case
Last year, my wife and I built our new home, which was a new experience for both of us. The time required from the day we signed the new construction contract until we received our keys at closing was almost 11 months. For the majority of that time, our personal information was in the hands of multiple entities that were making our dream home a reality.
For the mortgage company, there were likely three classifications (or roles) with access to our personal information. For simplicity, we can assume the roles are as follows:
- Customer Rep: This role can add and edit individual records, with full access to all sensitive data.
- Mortgage Analyst: This role has the ability to view all information, but sensitive data is masked or hidden.
- Reporting System: This role can view information, but cannot see the passcode column at all.
Now, let’s see how a Data Privacy Vault implementation could work for this mortgage company.
Getting to Know Skyflow
I recently started looking at a company called Skyflow because they offer a privacy API that’s a gateway to their Data Privacy Vault service. The fact that they recently landed $45 million in a series B round of funding also piqued my interest in their API.
Skyflow was founded in 2019 and their mission is to deliver Data Privacy Vaults via a simple and elegant API, so every customer can have best-of-breed data privacy.
Below is a simple illustration that demonstrates how Skyflow integrates with applications and services:
By design, Skyflow becomes another service-tier source for the client to utilize. When data needs to be protected, the client or consumer interacts directly with the Skyflow RESTful APIs.
When new data is submitted to the Skyflow platform, distinct tokens for the new records are returned to the consumer. These tokens can then be included in requests to other RESTful APIs in order to link the records across services.
Other service-tier applications and services often make API calls directly to Skyflow using purpose-driven service accounts - like the Reporting System role noted above. Because the account is tied to a role, the appropriate level of data will always be returned.
Privacy Vault Tip - the best and most secure design is going to tokenize the sensitive information as early as possible and detokenize as late as possible.
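To make the tip concrete, here is a minimal Python sketch of the tokenize-early, detokenize-late pattern. The InMemoryVault class and the handle_signup and print_statement functions are hypothetical names of my own for illustration. They are not part of the Skyflow API, and a real vault would add encryption, access control, and persistence.

```python
import uuid


class InMemoryVault:
    """Toy stand-in for a data privacy vault (hypothetical, not Skyflow's API)."""

    def __init__(self):
        self._store = {}  # token -> sensitive value

    def tokenize(self, value: str) -> str:
        # Swap the sensitive value for a random token that reveals nothing about it.
        token = str(uuid.uuid4())
        self._store[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Only the final consumer that truly needs the value should call this.
        return self._store[token]


def handle_signup(vault: InMemoryVault, ssn: str) -> dict:
    # Tokenize at the edge, as soon as the value enters the system.
    return {"ssn_token": vault.tokenize(ssn)}


def print_statement(vault: InMemoryVault, record: dict) -> str:
    # Detokenize as late as possible, right before the value is used.
    ssn = vault.detokenize(record["ssn_token"])
    return f"Statement for SSN ending in {ssn[-4:]}"


vault = InMemoryVault()
record = handle_signup(vault, "123-45-6789")
statement = print_statement(vault, record)
```

Everything between handle_signup and print_statement only ever sees the token, so a leak from any of those layers exposes nothing sensitive.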
Taking Skyflow for a Test Drive
Using the mortgage company use case, I decided to take Skyflow for a test drive by launching the following URL:
https://www.skyflow.com/try-skyflow
After filling out some basic information, I received an email to help get started with Skyflow for free. Once I registered, I arrived at the Skyflow Studio UI.
Create a Custom Vault
The first thing I needed to do was create a new vault for the mortgage company’s Data Privacy Vault. I used the Create Vault | Start From Scratch option in the Skyflow UI.
I gave my new vault the name MortgageCompany and changed the default table name to customers.
Next, I defined a new schema in order to manage the following attributes:
- Social Security Number (SSN)
- Passcode (passcode)
Skyflow includes common PII elements (called Skyflow Data Types), which is what I decided to use for the SSN.
For the Passcode property, I went with the basic string data type.
With both of these properties set, I clicked the Create Vault button to finish the process.
I used the Insert Record option to add five new entries into this new customers table:
The skyflow_id column represents the distinct key to access the data stored in customers of the MortgageCompany vault within Skyflow. This key will be added to the corresponding record in data stores that link to this data.
As an example, the PEOPLE table in a relational database would no longer contain the SSN and PASSCODE columns, but instead, it would store the SKYFLOW_ID as the bridge to Skyflow for each record in the PEOPLE table.
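The bridge described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under a few assumptions: the NAME column and the "Pat Example" row are hypothetical, and a plain dictionary stands in for the vault. The skyflow_id, SSN, and passcode values are borrowed from the sample data used in this walkthrough.

```python
import sqlite3

# Stand-in for the vault: skyflow_id -> sensitive fields (sample values only).
vault = {
    "9124be90-aad6-4ba5-93e8-61c8d247f577": {"ssn": "123-45-6789", "passcode": "ABCD"},
}

# The application database keeps no SSN or PASSCODE columns, only the bridge key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PEOPLE (ID INTEGER PRIMARY KEY, NAME TEXT, SKYFLOW_ID TEXT)")
conn.execute(
    "INSERT INTO PEOPLE (NAME, SKYFLOW_ID) VALUES (?, ?)",
    ("Pat Example", "9124be90-aad6-4ba5-93e8-61c8d247f577"),
)


def load_person(person_id: int) -> dict:
    # Read the non-sensitive row, then follow SKYFLOW_ID to the vault record.
    name, skyflow_id = conn.execute(
        "SELECT NAME, SKYFLOW_ID FROM PEOPLE WHERE ID = ?", (person_id,)
    ).fetchone()
    return {"name": name, **vault[skyflow_id]}


person = load_person(1)
columns = [row[1] for row in conn.execute("PRAGMA table_info(PEOPLE)")]
```

A copy of this PEOPLE table, even one made carelessly for testing, would carry only the SKYFLOW_ID values and none of the sensitive data.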
Configuring IAM
I navigated to the Settings | Vault screen of the Skyflow Studio UI and established the three roles noted above:
The roles included the following policies:
Customer Rep:
ALLOW ALL ON customers.* WITH REDACTION = PLAIN_TEXT
Mortgage Analyst:
ALLOW READ ON customers.* WITH REDACTION = DEFAULT
Reporting System:
ALLOW READ ON customers.skyflow_id WITH REDACTION = DEFAULT
ALLOW READ ON customers.ssn WITH REDACTION = DEFAULT
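To reason about what each policy means for a caller, I find it helpful to model the three roles in a few lines of Python. This is a rough sketch and not how Skyflow evaluates policies: the POLICIES table, the view_record function, and the masking shapes (XXX-XX- plus the last four digits for an SSN, *REDACTED* for a plain string) are my own assumptions, modeled on the responses returned later in this walkthrough. The sketch also silently drops columns a role cannot read, which the real service may handle differently.

```python
REDACTED = "*REDACTED*"


def mask_ssn(ssn: str) -> str:
    # Assumed default redaction for an SSN: keep only the last four digits.
    return "XXX-XX-" + ssn[-4:]


# Assumed default redaction per column (identifiers pass through unchanged).
DEFAULT_REDACTION = {
    "skyflow_id": lambda value: value,
    "ssn": mask_ssn,
    "passcode": lambda value: REDACTED,
}

# Column -> redaction mode for each role; a missing column is not readable.
POLICIES = {
    "Customer Rep": {"skyflow_id": "PLAIN_TEXT", "ssn": "PLAIN_TEXT", "passcode": "PLAIN_TEXT"},
    "Mortgage Analyst": {"skyflow_id": "DEFAULT", "ssn": "DEFAULT", "passcode": "DEFAULT"},
    "Reporting System": {"skyflow_id": "DEFAULT", "ssn": "DEFAULT"},
}


def view_record(role: str, record: dict) -> dict:
    """Return the record as the given role would be allowed to see it."""
    policy = POLICIES[role]
    visible = {}
    for column, value in record.items():
        mode = policy.get(column)
        if mode is None:
            continue  # the role has no READ access to this column
        visible[column] = value if mode == "PLAIN_TEXT" else DEFAULT_REDACTION[column](value)
    return visible


record = {
    "skyflow_id": "9124be90-aad6-4ba5-93e8-61c8d247f577",
    "ssn": "123-45-6789",
    "passcode": "ABCD",
}
analyst_view = view_record("Mortgage Analyst", record)
reporting_view = view_record("Reporting System", record)
```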
From there, I could create Users and Service Accounts in the MortgageCompany vault and grant the best role(s) for their position on the team.
Rather than create multiple accounts, I visited the Settings | Account screen in the Skyflow Studio UI, sent an invitation to my Gmail account, and then assigned the IAM role of Interop User for a new user named John J Vester.
Once my second account was activated, I was able to navigate to the Settings | Vault screen in the Skyflow Studio UI and use the Share Vault button to grant access to John J Vester with the Customer Rep role.
Going forward, I can use the John J Vester account to make RESTful API calls to Skyflow using basic cURL commands.
Skyflow in Action
In order to interact with the Skyflow RESTful APIs, we need to create a Bearer Token. While logged in to the Skyflow Studio UI using the John J Vester account, I single-clicked my user icon in the top-right corner and selected Generate API Bearer Token:
Next, I generated a short-term token to access Skyflow via their RESTful APIs:
Using my Bearer Token, the following cURL can retrieve data directly from Skyflow as a Customer Rep:
curl --location --request GET 'https://ebfc9bee4242.vault.skyflowapis.com/v1/vaults/ya45533150084aa08e474c982adc7dd7/customers' \
--header 'Authorization: Bearer ey … Dnt-w'
A 200 (OK) HTTP response was received, along with the following payload data:
{
  "records": [
    {
      "fields": {
        "passcode": "CDEF",
        "skyflow_id": "0bd21a25-7d03-423b-b4da-699cdfbc2742",
        "ssn": "345-67-8901"
      }
    },
    {
      "fields": {
        "passcode": "EFGH",
        "skyflow_id": "4c51fd3a-b137-479e-a85e-7931f6497e4c",
        "ssn": "567-89-0123"
      }
    },
    {
      "fields": {
        "passcode": "DEFG",
        "skyflow_id": "72ac4e20-a88b-4add-9baa-56f9ae5c10d8",
        "ssn": "456-78-9012"
      }
    },
    {
      "fields": {
        "passcode": "ABCD",
        "skyflow_id": "9124be90-aad6-4ba5-93e8-61c8d247f577",
        "ssn": "123-45-6789"
      }
    },
    {
      "fields": {
        "passcode": "BCDE",
        "skyflow_id": "fb0b30c1-0ac7-40d7-a1b8-1be9c81ea89b",
        "ssn": "234-56-7890"
      }
    }
  ]
}
As you can see, the Customer Rep role has access to see the plain text values for the passcode and SSN columns.
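For anyone who would rather script this than use cURL, the same request and response handling can be sketched with Python's standard library. The vault URL and vault ID are copied from the cURL command above. The build_request and index_by_skyflow_id helpers are hypothetical names of mine, and the bearer token is a placeholder you would replace with your own short-lived token. The sketch builds the request without sending it and parses one record from the response above in place of a live call.

```python
import json
import urllib.request

VAULT_URL = "https://ebfc9bee4242.vault.skyflowapis.com"  # from the cURL command
VAULT_ID = "ya45533150084aa08e474c982adc7dd7"  # from the cURL command


def build_request(table: str, bearer_token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for the records in a table."""
    return urllib.request.Request(
        f"{VAULT_URL}/v1/vaults/{VAULT_ID}/{table}",
        headers={"Authorization": f"Bearer {bearer_token}"},
        method="GET",
    )


def index_by_skyflow_id(payload: str) -> dict:
    """Turn a response body into {skyflow_id: {column: value}}."""
    records = json.loads(payload)["records"]
    return {r["fields"]["skyflow_id"]: r["fields"] for r in records}


# Placeholder token (an assumption); never hard-code a real one.
request = build_request("customers", "<your-bearer-token>")

# One record from the response above, used instead of calling the service.
sample = """
{"records": [{"fields": {"passcode": "ABCD",
                         "skyflow_id": "9124be90-aad6-4ba5-93e8-61c8d247f577",
                         "ssn": "123-45-6789"}}]}
"""
by_id = index_by_skyflow_id(sample)
```

To send the request for real, pass it to urllib.request.urlopen and feed the response body to index_by_skyflow_id.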
Next, I visited the Settings | Vault | Users screen in the Skyflow Studio UI and changed the role of John J Vester to be a Mortgage Analyst.
I re-ran the same cURL command, but this time the sensitive data was masked or partially redacted:
{
  "records": [
    {
      "fields": {
        "passcode": "*REDACTED*",
        "skyflow_id": "0bd21a25-7d03-423b-b4da-699cdfbc2742",
        "ssn": "XXX-XX-8901"
      }
    },
    {
      "fields": {
        "passcode": "*REDACTED*",
        "skyflow_id": "4c51fd3a-b137-479e-a85e-7931f6497e4c",
        "ssn": "XXX-XX-0123"
      }
    },
    {
      "fields": {
        "passcode": "*REDACTED*",
        "skyflow_id": "72ac4e20-a88b-4add-9baa-56f9ae5c10d8",
        "ssn": "XXX-XX-9012"
      }
    },
    {
      "fields": {
        "passcode": "*REDACTED*",
        "skyflow_id": "9124be90-aad6-4ba5-93e8-61c8d247f577",
        "ssn": "XXX-XX-6789"
      }
    },
    {
      "fields": {
        "passcode": "*REDACTED*",
        "skyflow_id": "fb0b30c1-0ac7-40d7-a1b8-1be9c81ea89b",
        "ssn": "XXX-XX-7890"
      }
    }
  ]
}
The Skyflow platform acted immediately on the changes made to the Vault settings and limited the results to what the Mortgage Analyst role is allowed to see.
Finally, I visited the Settings | Vault | Users screen in the Skyflow Studio UI and changed the role of John J Vester to be the Reporting System role.
Since the Reporting System role does not have access to the passcode column, I updated the cURL command as shown below, limiting the results to only the skyflow_id and the redacted SSN values:
curl --location --request GET 'https://ebfc9bee4242.vault.skyflowapis.com/v1/vaults/ya45533150084aa08e474c982adc7dd7/customers?fields=skyflow_id&fields=ssn' \
--header 'Authorization: Bearer ey ...Dnt-w'
As expected, only the skyflow_id and masked SSN values were returned:
{
  "records": [
    {
      "fields": {
        "skyflow_id": "0bd21a25-7d03-423b-b4da-699cdfbc2742",
        "ssn": "XXX-XX-8901"
      }
    },
    {
      "fields": {
        "skyflow_id": "4c51fd3a-b137-479e-a85e-7931f6497e4c",
        "ssn": "XXX-XX-0123"
      }
    },
    {
      "fields": {
        "skyflow_id": "72ac4e20-a88b-4add-9baa-56f9ae5c10d8",
        "ssn": "XXX-XX-9012"
      }
    },
    {
      "fields": {
        "skyflow_id": "9124be90-aad6-4ba5-93e8-61c8d247f577",
        "ssn": "XXX-XX-6789"
      }
    },
    {
      "fields": {
        "skyflow_id": "fb0b30c1-0ac7-40d7-a1b8-1be9c81ea89b",
        "ssn": "XXX-XX-7890"
      }
    }
  ]
}
Of course, this is merely a simple example to give a high-level overview of the benefits that the Data Privacy Vault API can provide for protected data.
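The repeated fields query parameters in the Reporting System cURL command are easy to get wrong when assembled by hand. Here is a small Python sketch that builds the same URL with the standard library. The records_url helper is a hypothetical name of mine, and the base URL is copied from the cURL command used in this walkthrough.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = (
    "https://ebfc9bee4242.vault.skyflowapis.com"
    "/v1/vaults/ya45533150084aa08e474c982adc7dd7/customers"
)


def records_url(fields=None) -> str:
    """Return the records URL, optionally restricted to the given columns."""
    if not fields:
        return BASE
    # doseq=True expands the list into repeated fields=... pairs.
    return f"{BASE}?{urlencode({'fields': fields}, doseq=True)}"


url = records_url(["skyflow_id", "ssn"])
requested = parse_qs(urlsplit(url).query)["fields"]
```

Keeping the column list in one place also makes it simple to request only what a role is permitted to read.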
Skyflow also provides a Postman Collection preconfigured with RESTful APIs for your vault, which can be found within the plug icon menu:
Once imported into Postman, the vault queries look like this:
Conclusion
Skyflow asks the question “what if privacy had an API?” and delivers that API with a fully-fledged platform that allows customers of any size to protect their sensitive data through a collection of access-appropriate roles.
In the use case example, I demonstrated how quickly you can create a Data Privacy Vault using Skyflow, so you can easily integrate with any existing applications and services. This is a stark contrast to database-driven approaches which often involve a series of manually-driven tasks.
Since last year, I have been trying to live by the following mission statement, which I feel can apply to any IT professional:
“Focus your time on delivering features/functionality which extends the value of your intellectual property. Leverage frameworks, products, and services for everything else.”
- J. Vester
Skyflow certainly adheres to my personal mission statement by making it easy for feature and service team developers to leverage the Skyflow Data Privacy Vault and accompanying APIs, so that they can stay focused on meeting new objectives.
If your application landscape requires PCI, PII, PHI, GDPR, or HIPAA compliance and you are managing sensitive information by using manual or legacy processes, then it might be time to consider adding Skyflow to your short-term API adoption roadmap.
Have a really great day!