As part of a recent hackathon project, I pulled together some data insights from AppMap data.
For those unfamiliar, the AppMap framework is a suite of tools that allows developers to record runtime data from their applications. An AppMap client, essentially an agent, is responsible for capturing this data from a live application. It emits a file containing the recorded execution flow, data snapshots, high-level process I/O (think HTTP requests, SQL queries, etc.) and some application metadata. I figured it would be an interesting experiment to identify the different types of data accessed and label their usage.
Here's what I came up with in two days: The AppMap data inspector. The demo link contains some example data so there's no need to come up with your own.
This proof of concept makes some attempt at identifying the following:
- Sensitive values such as passwords and auth tokens
- Encrypted or hashed values, such as password hashes
- Unencrypted values which should be encrypted
- Data persisted within a database
- Data provided by a user, such as an HTTP request parameter
- Personally identifiable information: SSNs, emails, IP addresses (this may not count as PII, but I left it in for the proof of concept)
- Data written to application logs
A single object can have multiple labels. PII in your application logs? Uh oh! Sensitive data persisting in your database unencrypted? Whoops! References to unencrypted passwords all over the place? Might be a code smell...
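To make the multi-label idea concrete, here's a minimal Python sketch. It is not the inspector's actual code: the function name `label_value`, the label strings, the regexes, and the `from_user` / `persisted` / `logged` flags are all assumptions made up for illustration. In a real tool those flags would presumably be derived from where a value shows up in the recorded AppMap events.

```python
import re

# Hypothetical heuristics for illustration only; the real inspector's
# rules are not described in this post.
SENSITIVE_NAME = re.compile(r"password|passwd|secret|token|api_?key", re.IGNORECASE)
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# bcrypt hashes look like: $2b$12$ followed by 53 characters of salt and digest
BCRYPT_HASH = re.compile(r"^\$2[aby]\$\d{2}\$[./A-Za-z0-9]{53}$")


def label_value(name, value, from_user=False, persisted=False, logged=False):
    """Return the set of labels that apply to one named value."""
    labels = set()
    text = str(value)

    if SENSITIVE_NAME.search(name):
        labels.add("sensitive")

    if BCRYPT_HASH.match(text):
        labels.add("hashed")
    elif "sensitive" in labels:
        # Sensitive by name, but the value does not look hashed.
        labels.add("should_be_encrypted")

    if EMAIL.search(text) or SSN.search(text):
        labels.add("pii")

    # Usage labels depend on where the value was observed (assumed inputs).
    if from_user:
        labels.add("user_provided")
    if persisted:
        labels.add("persisted")
    if logged:
        labels.add("logged")

    return labels


# A plaintext password that came from a request and ended up in the database
# picks up four labels at once.
print(sorted(label_value("password", "hunter2", from_user=True, persisted=True)))
```

The interesting findings come from combinations: `pii` together with `logged`, or `should_be_encrypted` together with `persisted`, are exactly the "uh oh" cases described above.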
I see a ton of potential improvements for this project, but I'm inclined to let it live on as a proof of concept for now.
If you have some ideas of your own, I'd love to hear them in the comments below!
Thanks for reading!