Matthew Watkins

Originally published at anotherdevblog.net

Flattening JSON in JSON.NET

Originally published on AnotherDevBlog.

The use case

I've got an interesting problem at work where I need to take any arbitrary JSON blob (object or array) and represent the leaf nodes in memory as a collection of key/value pairs. For example, given this JSON:

[
  {
    "Name": "Fish",
    "Color": "Silver",
    "Attributes": [
      {
        "Name": "Environment",
        "Value": "Aquatic"
      },
      {
        "Name": "Parts",
        "Value": [
          {
            "Type": "fin",
            "Length": 3
          }
        ]
      }
    ]
  }
]

I want to see something like this output:

[0].Name = Fish
[0].Color = Silver
[0].Attributes[0].Name = Environment
[0].Attributes[0].Value = Aquatic
[0].Attributes[1].Name = Parts
[0].Attributes[1].Value[0].Type = fin
[0].Attributes[1].Value[0].Length = 3

Here's what I did, and the lessons I learned

Lesson 1: Not everything is on StackOverflow

"This sounds like a pretty common use case," I said to myself. "Surely there is something in the documentation or on StackOverflow."

Nope. I searched StackOverflow for quite a while, and while I found a few answers referring to Java libraries, I couldn't find one for the Json.NET library we are using here. The most popular NuGet package on the internet and no one has ever faced this issue before? Seriously?!

It took a few hours and lots of debugging, but I eventually wrote an extension method to allow me to grab the leaf node values of arbitrary JSON:

using System.Collections.Generic;
using Newtonsoft.Json.Linq;

public static class JExtensions
{
    public static IEnumerable<JValue> GetLeafValues(this JToken jToken)
    {
        if (jToken is JValue jValue)
        {
            yield return jValue;
        }
        else if (jToken is JArray jArray)
        {
            foreach (var result in GetLeafValuesFromJArray(jArray))
            {
                yield return result;
            }
        }
        else if (jToken is JProperty jProperty)
        {
            foreach (var result in GetLeafValuesFromJProperty(jProperty))
            {
                yield return result;
            }
        }
        else if (jToken is JObject jObject)
        {
            foreach (var result in GetLeafValuesFromJObject(jObject))
            {
                yield return result;
            }
        }
    }

    #region Private helpers

    static IEnumerable<JValue> GetLeafValuesFromJArray(JArray jArray)
    {
        for (var i = 0; i < jArray.Count; i++)
        {
            foreach (var result in GetLeafValues(jArray[i]))
            {
                yield return result;
            }
        }
    }

    static IEnumerable<JValue> GetLeafValuesFromJProperty(JProperty jProperty)
    {
        foreach (var result in GetLeafValues(jProperty.Value))
        {
            yield return result;
        }
    }

    static IEnumerable<JValue> GetLeafValuesFromJObject(JObject jObject)
    {
        foreach (var jToken in jObject.Children())
        {
            foreach (var result in GetLeafValues(jToken))
            {
                yield return result;
            }
        }
    }

    #endregion
}
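
As an aside, the type-by-type dispatch above could probably be condensed, because JToken exposes a Children() method that JObject, JProperty, and JArray all implement. The following is only a sketch under that assumption; the class name JExtensionsCompact and the method name GetLeafValuesCompact are made up for illustration and are not part of Json.NET.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

public static class JExtensionsCompact
{
    // Hypothetical compact variant: a JValue is a leaf, anything else is walked via Children().
    public static IEnumerable<JValue> GetLeafValuesCompact(this JToken jToken)
    {
        if (jToken is JValue jValue)
        {
            yield return jValue;
            yield break;
        }

        // For a JObject the children are its JProperty tokens; for a JProperty,
        // its value; for a JArray, its elements.
        foreach (var child in jToken.Children())
        {
            foreach (var leaf in child.GetLeafValuesCompact())
            {
                yield return leaf;
            }
        }
    }
}
```

Like the longer version, this stays lazy: nothing is walked until the caller enumerates the result.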

Then in my calling code, I just extract the Path and Value properties from the JValue objects returned:

var jToken = JToken.Parse("blah blah json here");
foreach (var jValue in jToken.GetLeafValues())
{
    Console.WriteLine($"{jValue.Path} = {jValue.Value}");
}

Awesome!

Lesson 2: But it's always on StackOverflow

So it turns out there is an answer on StackOverflow for this use case (link). I was searching for terms like "get all leaf nodes" or "get all values with paths," but the magic keyword to make the answer appear is "flatten." Here's the answer code that was posted:

JObject jsonObject=JObject.Parse(theJsonString);
IEnumerable<JToken> jTokens = jsonObject.Descendants().Where(p => p.Count() == 0);
Dictionary<string, string> results = jTokens.Aggregate(new Dictionary<string, string>(), (properties, jToken) =>
{
    properties.Add(jToken.Path, jToken.ToString());
    return properties;
});

Lesson 3: But you can't always just copy what's on StackOverflow

Wow, that code snippet is a lot shorter than my solution, so I tried it out. But I ultimately went back to my own. Here's why:

  1. This solution doesn't work, at least not for my case. See, I need it to handle an arbitrary JSON blob. I can't promise it's going to be a JObject; it could be an array or something else, so this solution, unfortunately, fails for me out of the gate with my first test case (an array). And JToken doesn't have a handy little Descendants() method I can call like JObject does, so I'd have to do some type checking anyway. Yuck.
  2. Another problem: this solution builds a dictionary in memory to represent the flattened structure. I'm dealing with some pretty massive objects, and it's already painful enough to load up that initial JToken. I'd really rather not add the memory pressure of the dictionary on top of that.
  3. Speaking of memory, I'd like to (eventually) only return the JValue if it's not null or default for the value type.
  4. That .Count() looks really expensive since it's a method being called on every single descendant, whether or not you end up using that descendant. It's probably safer to select only the descendants that you know are JValue objects: .Descendants().OfType<JValue>(). And when you have a JValue object, you can call .Value and get the underlying primitive (or pseudo-primitive string) value without calling .ToString().
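
To make those four points concrete, here is a rough sketch of how the StackOverflow approach might be adapted. It relies on Descendants() being defined on JContainer (the base class of JObject, JArray, and JProperty), so a type check against JContainer covers arrays as well as objects. The names FlattenSketch and GetNonNullLeaves are hypothetical, and the sketch only skips JSON nulls; filtering out "default" values is left out.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class FlattenSketch
{
    // Hypothetical helper: lazily yields every non-null leaf, with no intermediate dictionary.
    public static IEnumerable<JValue> GetNonNullLeaves(this JToken root)
    {
        // A bare primitive at the root has no descendants, so handle it directly.
        if (root is JValue single)
        {
            if (single.Type != JTokenType.Null) yield return single;
            yield break;
        }

        // JObject, JArray, and JProperty all derive from JContainer.
        if (root is JContainer container)
        {
            // OfType<JValue>() keeps only leaves, so no Count() call per descendant.
            foreach (var leaf in container.Descendants().OfType<JValue>())
            {
                if (leaf.Type != JTokenType.Null) yield return leaf;
            }
        }
    }
}
```

Calling code would look the same as before: parse with JToken.Parse, enumerate GetNonNullLeaves(), and read Path and Value from each JValue.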

Top comments (2)

Lluís Josep Martínez

I don't know if there's something like a JSON streaming parser (in Java there's a SAX streaming parser).

If it exists, a single loop plus a stack should be enough to get the desired output.
And the memory consumption would be O(1).
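
For reference, Json.NET does include a forward-only reader, JsonTextReader, which tracks the current Path as it goes, so the loop described in this comment might look roughly like the sketch below. The class and method names (StreamingFlattenSketch, ReadLeaves) are invented for illustration. Memory use should stay small, although the reader still keeps track of its position at each nesting level, so it is closer to proportional to depth than strictly constant.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public static class StreamingFlattenSketch
{
    // Hypothetical helper: streams path/value pairs without building a JToken tree.
    public static IEnumerable<KeyValuePair<string, object>> ReadLeaves(TextReader text)
    {
        using (var reader = new JsonTextReader(text))
        {
            while (reader.Read())
            {
                switch (reader.TokenType)
                {
                    // Primitive tokens are the leaves; structural tokens
                    // (StartObject, PropertyName, EndArray, ...) are skipped.
                    case JsonToken.String:
                    case JsonToken.Integer:
                    case JsonToken.Float:
                    case JsonToken.Boolean:
                    case JsonToken.Date:
                    case JsonToken.Bytes:
                    case JsonToken.Null:
                        yield return new KeyValuePair<string, object>(reader.Path, reader.Value);
                        break;
                }
            }
        }
    }
}
```

A caller could pass a StreamReader over a file or network stream, which avoids loading the whole document into memory first.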

Andrew S

Thank you for the article, fully agree on every point! The funny thing is I had just finished a long and unsuccessful search on SO for exactly that issue when Twitter notified me about this post.