Sometimes, when we’re working with data, we need to be careful about what we expose when we serialize our objects. On a recent project, I was dealing with various types of Personally Identifiable Information (PII) and needed a consistent, easy way to mark fields as sensitive. I discussed the problem with my team, and after a few rounds of spitballing and bouncing ideas off each other, the CensoredContentAttribute was born.
Note: All the code discussed in this article can be found in this Gist. Eventually, I plan to set it up in a repo with some other useful classes and snippets, but for now, GitHub Gist is its home away from home.
To start, let’s preview an example of how this attribute works in practice.
The important bits are on line 9, [Redacted(ShowFirst:3, ShowLast:3)], and on line 13. Adding the RedactedAttribute to the Name property of this LoggingExample class sets it up to be read when we serialize the class later on in the ToSerializedString method. The ShowFirst and ShowLast properties of the RedactedAttribute let us specify how many characters at the start and end of the original value should be shown in the serialized output. This is very handy when serializing things like Social Security Numbers, email addresses, phone numbers, or usernames, where some fields may allow part of the data to be viewed and others need to be fully censored.
CensoredJsonSerializerSettings<T> is a class derived from JsonSerializerSettings with a constraint on the type T, limiting it to children of the CensoredContentAttribute. This class provides the necessary ContractResolver to the JsonSerializer so that your censored fields are handled properly.
Next, we move on to the core of the attribute: the CensoredContentContractResolver<T> and the CensoredContentValueProvider. Here’s the code:
Lines 8–13 are the abstract CensoredContentAttribute, which defines a base class for us to attach most of our logic to and, more importantly, gives us a strong type to attach our generic type constraints to. It’s a simple class with one read-only property, Censor, and one method, TruncateData(string input). The Censor property defines the value that will be used to censor the content. In our example attribute, RedactedAttribute, we use the value ***REDACTED***. Similarly, the TruncateData method does what it says on the tin: it performs the actual truncation of your data.
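Since the original listing lives in the Gist, here is a minimal sketch of what a base attribute with that shape could look like. Whether the real members are abstract or virtual is an assumption on my part, and FullyCensoredAttribute is a hypothetical subclass added only to show how the two members fit together.

```csharp
using System;

// Sketch of the base attribute as described in the text: one read-only
// property (Censor) and one method (TruncateData). Making both members
// abstract is an assumption; the Gist may define them differently.
[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
public abstract class CensoredContentAttribute : Attribute
{
    // The placeholder text that stands in for censored content.
    public abstract string Censor { get; }

    // Returns the censored (or truncated) form of the input.
    public abstract string TruncateData(string input);
}

// Hypothetical subclass, not from the article: hides the entire value.
public sealed class FullyCensoredAttribute : CensoredContentAttribute
{
    public override string Censor => "***CENSORED***";

    public override string TruncateData(string input) => Censor;
}
```

The abstract base does no work itself; its main job is to give the generic constraints on the resolver and settings classes a common type to target.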
Lines 15–28 (technically through 50, as the CensoredContentValueProvider is an internal class, but we’ll get to that in a moment) make up the core of the CensoredContentContractResolver<T> class. This class is what the Newtonsoft JSON serializer uses to determine whether a member on the object being serialized should use the default value provider or our custom CensoredContentValueProvider. Overall, it’s pretty straightforward: through reflection, it checks that the member we’re getting the value of has a child of the CensoredContentAttribute on it. We then get a reference to that custom attribute and pass it into the constructor of the CensoredContentValueProvider, along with a reference to the standard IValueProvider.
Lines 30–49 make up the CensoredContentValueProvider class, where the real magic happens. In here, we have a couple of properties for holding references to the values passed in, namely the base IValueProvider and the concrete implementation of our attribute, typed as the base CensoredContentAttribute. In the GetValue method (line 39), we first need to pull out the string value of the member being censored. It’s a standard call to the GetValue method of the IValueProvider interface, but we need to add a null check and null coalescing so we don’t error out if the member hasn’t been set. After that, we pass our targetString into the TruncateData method of our attribute class and return the result.
The SetValue method isn’t particularly important in this case, since this is intended as a one-way censoring attribute, but there are plenty of interesting things you could do there, such as encryption when serializing to and from JSON. I’m interested to see what ideas you have. Let me know in the comments what you come up with.
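To tie the settings class, the contract resolver, and the value provider together, here is a rough sketch of how the three could be wired up with Newtonsoft.Json. It is a reconstruction from the description above rather than the Gist itself, and it assumes the Newtonsoft.Json package is referenced. The base attribute is repeated in minimal form so the block compiles on its own, and MaskedAttribute and Account are hypothetical names used only to exercise it.

```csharp
using System;
using System.Reflection;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

// Minimal stand-in for the base attribute, repeated so this block is self-contained.
public abstract class CensoredContentAttribute : Attribute
{
    public abstract string Censor { get; }
    public abstract string TruncateData(string input);
}

public class CensoredContentContractResolver<T> : DefaultContractResolver
    where T : CensoredContentAttribute
{
    protected override JsonProperty CreateProperty(MemberInfo member, MemberSerialization memberSerialization)
    {
        JsonProperty property = base.CreateProperty(member, memberSerialization);

        // Swap in the censoring provider only when the member carries a T attribute.
        T attribute = member.GetCustomAttribute<T>();
        if (attribute != null && property.ValueProvider != null)
        {
            property.ValueProvider = new CensoredContentValueProvider(property.ValueProvider, attribute);
        }
        return property;
    }

    private sealed class CensoredContentValueProvider : IValueProvider
    {
        private readonly IValueProvider _inner;
        private readonly CensoredContentAttribute _attribute;

        public CensoredContentValueProvider(IValueProvider inner, CensoredContentAttribute attribute)
        {
            _inner = inner;
            _attribute = attribute;
        }

        public object GetValue(object target)
        {
            // Null-safe read of the original value before handing it to the attribute.
            string targetString = _inner.GetValue(target)?.ToString() ?? string.Empty;
            return _attribute.TruncateData(targetString);
        }

        // One-way censoring: writes pass straight through to the default provider.
        public void SetValue(object target, object value) => _inner.SetValue(target, value);
    }
}

public class CensoredJsonSerializerSettings<T> : JsonSerializerSettings
    where T : CensoredContentAttribute
{
    public CensoredJsonSerializerSettings()
    {
        ContractResolver = new CensoredContentContractResolver<T>();
    }
}

// Hypothetical attribute and model, used only to exercise the sketch.
public sealed class MaskedAttribute : CensoredContentAttribute
{
    public override string Censor => "***MASKED***";
    public override string TruncateData(string input) => Censor;
}

public class Account
{
    public string Id { get; set; }

    [Masked]
    public string Email { get; set; }
}
```

With these types in place, JsonConvert.SerializeObject(account, new CensoredJsonSerializerSettings<MaskedAttribute>()) should write ***MASKED*** for Email, while JsonConvert.SerializeObject(account) with no settings writes the original value.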
That covers the groundwork and the core of our censorship engine, so now let’s dive into a couple of example attributes. Here’s the code:
Starting with the LargeDataAttribute, we have an example use case for the CensoredContentAttribute that doesn’t involve censorship for privacy reasons, but rather for log size reasons. In our case, we were trying to store a raw copy of the body from an HttpRequest during development for debugging and replay purposes, but for some requests that included file uploads, the body was excessively large. Using the LargeDataAttribute, we were able to specify a threshold over which the data on the decorated member would be replaced with a message like “***Large Data Removed*** [Length: 1024]”. This greatly eased the burden on our logging tools and gave us an easy way to get a general feel for what was included without bloating the logs.
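As a rough illustration of that threshold behaviour, a sketch along these lines would produce the message shown above. The Threshold property name, its default of 256, and the choice to report the original length in the message are all assumptions, since the article only says that a threshold can be specified. The base attribute is again repeated in minimal form so the block stands alone.

```csharp
using System;

// Minimal stand-in for the base attribute, repeated so this block is self-contained.
public abstract class CensoredContentAttribute : Attribute
{
    public abstract string Censor { get; }
    public abstract string TruncateData(string input);
}

public sealed class LargeDataAttribute : CensoredContentAttribute
{
    // Assumed property name and arbitrary default, for illustration only.
    public int Threshold { get; set; } = 256;

    public override string Censor => "***Large Data Removed***";

    public override string TruncateData(string input)
    {
        input = input ?? string.Empty;

        // Values at or under the threshold pass through untouched;
        // anything longer is replaced by the marker plus the original length.
        return input.Length > Threshold
            ? $"{Censor} [Length: {input.Length}]"
            : input;
    }
}
```

Keeping the length in the replacement text preserves a hint about what was removed, which is usually enough when scanning logs for oversized uploads.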
The RedactedAttribute is by far the more used of the two attributes, though. With it, we are able to flag members as Redacted and specify a varying amount of data that is okay to display in the logs when serialized. In the TruncateData method, we check that the combined number of leading and trailing characters to show does not exceed the length of the original input. If it does, we emit only the value of the Censor property (***REDACTED*** in this case). Otherwise, we append the first X characters of the input (where X is the value of ShowFirst) to our output string, append the Censor value, then do the same for the number of trailing characters specified by the ShowLast property. Finally, we run one last sanity check on the output and return it, or the Censor value if our planned output was somehow empty. This ensures that someone looking at the log cannot tell whether the original value was empty or contained fewer characters than the supplied Censor string (that’s also the reason the Censor string is padded with asterisks).
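The steps described above can be sketched roughly as follows. This is a reconstruction from the prose, not the Gist code. It assumes ShowFirst and ShowLast are settable properties, which means this sketch would be applied as [Redacted(ShowFirst = 3, ShowLast = 3)], and it treats negative counts as zero. The base attribute is repeated in minimal form so the block stands alone.

```csharp
using System;

// Minimal stand-in for the base attribute, repeated so this block is self-contained.
public abstract class CensoredContentAttribute : Attribute
{
    public abstract string Censor { get; }
    public abstract string TruncateData(string input);
}

public sealed class RedactedAttribute : CensoredContentAttribute
{
    // Number of leading and trailing characters allowed to remain visible.
    public int ShowFirst { get; set; }
    public int ShowLast { get; set; }

    public override string Censor => "***REDACTED***";

    public override string TruncateData(string input)
    {
        input = input ?? string.Empty;

        // Assumption: negative counts are treated as zero.
        int first = Math.Max(ShowFirst, 0);
        int last = Math.Max(ShowLast, 0);

        // Asking to show more characters than exist: censor the whole value.
        if (first + last > input.Length)
        {
            return Censor;
        }

        string output = input.Substring(0, first)
                        + Censor
                        + input.Substring(input.Length - last);

        // Final sanity check, mirroring the description in the text.
        return string.IsNullOrEmpty(output) ? Censor : output;
    }
}
```

Following the text, the guard only triggers when ShowFirst plus ShowLast exceeds the input length. A stricter variant could use >= instead, so that a value whose length exactly equals the two counts combined is not shown in full on either side of the Censor string.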
Packaging it all together, you end up with a pretty straightforward way to mark members of your classes as sensitive and modify the data before it goes through serialization. If you need to serialize the same object without the redaction, such as for writing to a DB, posting to an endpoint, or writing out to a file, you simply run it through the serializer without passing in the CensoredJsonSerializerSettings. Additionally, the <T> value of the CensoredJsonSerializerSettings allows you to use multiple children of the CensoredContentAttribute in a single class and trigger specific censor attributes based on the type you pass in as <T>. For example, we use <RedactedAttribute> for <T> when serializing for logs, and <LargeDataAttribute> when serializing for the DB.
Custom attributes provide a world of options and capabilities, and every developer tends to find their own way of using them. The CensoredContentAttribute is just one of the custom attributes we use here at DealerOn, and I plan on covering more of our helpful snippets and libraries in the future. Thanks for checking this out, and I look forward to reading how you are using custom attributes in your own projects.