Because yaml.load is actually dangerousLoad, and is potentially as dangerous as eval.
Python has another method, pickle, which can be potentially as dangerous as YAML, while being more concise and harder to edit than YAML. Sadly, I know of no such library in Node / JavaScript.
So, the safer method is actually JSON, which is highly editable, and for which you can provide the how-to on serialization / deserialization.
Also, in Python, serializing a non-serializable object will throw an Error (while in JavaScript, it will mostly default to {}, except for BigInt, which throws a TypeError for some reason).
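To make the JavaScript defaults concrete, here is a minimal sketch of what JSON.stringify does with a few non-serializable values. The variable names are made up for this example.

```typescript
// Set and RegExp have no enumerable own properties, so they become {}.
// Function and undefined values are dropped from objects entirely.
const plain = JSON.stringify({
  s: new Set([1]),
  r: /x/g,
  f: () => 1,
  u: undefined
})

// BigInt is the odd one out: JSON.stringify throws instead of defaulting.
let bigintError: unknown
try {
  JSON.stringify({ n: BigInt(1) })
} catch (e) {
  bigintError = e
}

console.log(plain)
console.log(bigintError instanceof TypeError)
```

So nothing fails loudly except BigInt, which is why the replacer has to handle every custom type explicitly.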
In Python, the how-to is at https://www.polvcode.dev/post/2019/09/custom-json, but I haven't done it for a while.
In JavaScript, JSON.stringify has a replacer and JSON.parse has a reviver.
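As a rough illustration of the replacer / reviver pair, the sketch below round-trips a Set. The `__type` tag and the `{ values }` shape are assumptions made up for this example, not the format used by any library.

```typescript
// Hypothetical tag name, chosen only for this sketch.
const TAG = '__type'

// replacer: turn a Set into a tagged plain object that JSON can hold.
function replacer (_key: string, value: unknown): unknown {
  if (value instanceof Set) {
    return { [TAG]: 'Set', values: Array.from(value) }
  }
  return value
}

// reviver: recognise the tagged object and rebuild the Set.
function reviver (_key: string, value: unknown): unknown {
  if (
    value !== null &&
    typeof value === 'object' &&
    (value as Record<string, unknown>)[TAG] === 'Set'
  ) {
    return new Set((value as { values: unknown[] }).values)
  }
  return value
}

const text = JSON.stringify({ tags: new Set(['a', 'b']) }, replacer)
const back = JSON.parse(text, reviver) as { tags: Set<string> }

console.log(text)
console.log(back.tags.has('b'))
```

The same pattern extends to any type, as long as the tag cannot collide with a real key in your data.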
How to identify typeof Anything
Firstly, you have to know all the custom types you want to serialize. Identifying them is most easily done with instanceof, but you cannot serialize with instanceof...
So, I identify the type first with typeof, but the second step is manual (by listing what is serializable; everything else is not).
// These are what are returned from typeof
export type TypeNativeSerializable = 'string' | 'number' | 'boolean'
export type TypeNativeNonSerializable = 'bigint' | 'symbol' | 'undefined' | 'function' | 'object'
// These are what aren't returned, but do have syntactic meaning. Have to be derived.
export type TypeExtra = 'Null' | 'Array' | 'Named' | 'Constructor' | 'NaN' | 'Infinity'
And I list the values that will be hard to serialize, grouped by type.
const specialTypes: Record<TypeExtra | TypeNativeNonSerializable, any[]> = {
  Null: [null],
  NaN: [NaN],
  Named: [new NamedClassWithMethods(), new NamedClassWithoutMethods()],
  Infinity: [Infinity, -Infinity],
  Array: [new Array(5)],
  Constructor: [NamedClassWithMethods, NamedClassWithoutMethods, Array, NamedArray],
  // Pass a string, because a number literal this large loses precision before BigInt sees it
  bigint: [BigInt('900719925474099133333332')],
  symbol: [Symbol('hello')],
  undefined: [undefined],
  object: [{ a: 1 }],
  function: [function fnLiteral (a: any) { return a }, (b: any) => b]
}
And now, type identification. The hard topic here is how to check if a JavaScript function is a constructor...
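The easier part of the identification can be sketched as below. This is only an assumption about how the derived names could be computed, with a hypothetical `describeType` helper; it deliberately leaves functions as 'function', because telling a constructor apart from a plain function is the hard part mentioned above.

```typescript
// Hypothetical helper: refine the result of typeof into the derived names.
function describeType (v: unknown): string {
  if (v === null) return 'Null' // typeof null is 'object'
  if (typeof v === 'number') {
    if (Number.isNaN(v)) return 'NaN'
    if (v === Infinity || v === -Infinity) return 'Infinity'
    return 'number'
  }
  if (Array.isArray(v)) return 'Array' // also matches subclasses of Array
  if (typeof v === 'object') {
    // Anything whose prototype is not Object.prototype came from a named class.
    const proto = Object.getPrototypeOf(v)
    if (proto !== null && proto !== Object.prototype) return 'Named'
  }
  return typeof v
}

class Example {}

console.log(describeType(new Example()))
console.log(describeType({ a: 1 }))
console.log(describeType(-Infinity))
```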
Serialization
To cut things short, I have identified how to serialize most native objects in my library. It has no dependencies, and works in both browser and Node (but I haven't polyfilled / shimmed it for older browsers yet).
patarapolw / any-serialize
Serialize any JavaScript objects, as long as you provides how-to. I have already provided Date, RegExp and Function.
I disable undefined serialization by default (i.e. undefined is excluded by default), but you might want to enable it. (I did so in my test.)
Most serialization is done by both .toString() and caching typeof Object.
RegExp objects are a little special. .toString() is hard to reconstruct, so I use RegExp#source and RegExp#flags instead.
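A minimal sketch of the RegExp approach, assuming a hypothetical `{ source, flags }` shape (the library's actual stored format may differ):

```typescript
// Hypothetical serialized shape for this example.
interface SerializedRegExp {
  source: string
  flags: string
}

// Store the two parts separately instead of parsing "/.../flags" back later.
function regExpToJSON (r: RegExp): SerializedRegExp {
  return { source: r.source, flags: r.flags }
}

function regExpFromJSON (o: SerializedRegExp): RegExp {
  return new RegExp(o.source, o.flags)
}

const original = /^hello /gi
const stored = JSON.stringify(regExpToJSON(original))
const restored = regExpFromJSON(JSON.parse(stored) as SerializedRegExp)

console.log(stored)
console.log(restored.test('Hello world'))
```

Keeping source and flags apart avoids having to find the last slash in the .toString() output, which is ambiguous when the pattern itself contains slashes.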
Hashing
There are some problematic topics here.
- JSON.stringify doesn't reliably sort keys.
- You cannot supply both a function replacer and a sorter to JSON.stringify.
- How do I hash functions and classes?
- Symbols should always be unique.
- Key collision
I have provided a way to sort keys without a library, via JSON.stringify with an Array as the second argument. The only catch is that you have to collect all Object keys first, including nested ones.
const clonedObj = this.deepCloneAndFindAndReplace([obj])[0]

// Collect every key of every nested plain object, so the Array replacer keeps them all
const keys = new Set<string>()
const getAndSortKeys = (a: any) => {
  if (a) {
    if (typeof a === 'object' && a.constructor.name === 'Object') {
      for (const k of Object.keys(a)) {
        keys.add(k)
        getAndSortKeys(a[k])
      }
    } else if (Array.isArray(a)) {
      a.forEach((el) => getAndSortKeys(el))
    }
  }
}
getAndSortKeys(clonedObj)

return this.stringifyFunction(clonedObj, Array.from(keys).sort())
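The fragment above depends on the library's internals, so here is a standalone sketch of the same idea. The names `collectKeys` and `stableStringify` are hypothetical, and this version skips the clone-and-replace step entirely.

```typescript
// Gather the keys of every nested object, walking into arrays as well.
function collectKeys (value: unknown, keys: Set<string> = new Set()): Set<string> {
  if (Array.isArray(value)) {
    value.forEach((el) => collectKeys(el, keys))
  } else if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>
    for (const k of Object.keys(obj)) {
      keys.add(k)
      collectKeys(obj[k], keys)
    }
  }
  return keys
}

// With an Array replacer, JSON.stringify emits keys in the Array's order,
// and drops any key that is not listed, hence the need to collect them all.
function stableStringify (value: unknown): string {
  return JSON.stringify(value, Array.from(collectKeys(value)).sort())
}

const first = stableStringify({ b: 1, a: { d: 2, c: 3 } })
const second = stableStringify({ a: { c: 3, d: 2 }, b: 1 })

console.log(first)
console.log(first === second)
```

Because the Array replacer occupies the replacer slot, there is no room left for a function replacer in the same call.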
In the library code, I also deepCloneAndFindAndReplace the Object, both to "supply both a function replacer and a sorter to JSON.stringify" and to prevent modifying the original Object on replace.
How do I hash functions and classes
For functions, I collapse all whitespace, but a proper and better way is probably to minify the stringified code. (I didn't put that in my code, to avoid adding dependencies.)
export const FullFunctionAdapter: IRegistration = {
  key: 'function',
  toJSON: (_this) => _this.toString().trim().replace(/\[native code\]/g, ' ').replace(/[\t\n\r ]+/g, ' '),
  fromJSON: (content: string) => {
    // eslint-disable-next-line no-new-func
    return new Function(`return ${content}`)()
  }
}
For classes, you will need to objectify them.
/**
 * https://stackoverflow.com/questions/34699529/convert-javascript-class-instance-to-plain-object-preserving-methods
 */
export function extractObjectFromClass (o: any, exclude: string[] = []) {
  const content = {} as any
  Object.getOwnPropertyNames(o).forEach((prop) => {
    if (['constructor', ...exclude].includes(prop)) {
      return
    }
    content[prop] = o[prop]
  })
  // Return the extracted plain object, not the original instance
  return content
}
It is possible to hash without a library. You just need to know the code.
/**
 * https://stackoverflow.com/questions/7616461/generate-a-hash-from-string-in-javascript
 *
 * https://stackoverflow.com/a/52171480/9023855
 *
 * @param str
 * @param seed
 */
export function cyrb53 (str: string, seed = 0) {
  let h1 = 0xdeadbeef ^ seed; let h2 = 0x41c6ce57 ^ seed
  for (let i = 0, ch; i < str.length; i++) {
    ch = str.charCodeAt(i)
    h1 = Math.imul(h1 ^ ch, 2654435761)
    h2 = Math.imul(h2 ^ ch, 1597334677)
  }
  h1 = Math.imul(h1 ^ h1 >>> 16, 2246822507) ^ Math.imul(h2 ^ h2 >>> 13, 3266489909)
  h2 = Math.imul(h2 ^ h2 >>> 16, 2246822507) ^ Math.imul(h1 ^ h1 >>> 13, 3266489909)
  return 4294967296 * (2097151 & h2) + (h1 >>> 0)
}
- Preventing key collision
- Symbols are always unique
This is as simple as Math.random().toString(36).substr(2), but you might use a proper UUID.
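One way this could be wired up is sketched below. The Map cache and the function names are assumptions for illustration; the idea is only that each distinct Symbol gets its own random ID, while the same Symbol keeps its ID within one run.

```typescript
// Random base-36 string, the same idea as the one-liner above.
function randomId (): string {
  return Math.random().toString(36).slice(2)
}

// Hypothetical cache: one ID per Symbol, so repeated lookups stay stable.
const symbolIds = new Map<symbol, string>()

function idForSymbol (s: symbol): string {
  let id = symbolIds.get(s)
  if (id === undefined) {
    id = randomId()
    symbolIds.set(s, id)
  }
  return id
}

const s1 = Symbol('hello')
const s2 = Symbol('hello') // same description, different Symbol

console.log(idForSymbol(s1) === idForSymbol(s1))
console.log(idForSymbol(s1) === idForSymbol(s2))
```

The IDs are not reproducible across runs, which matches the fact that a Symbol itself cannot be recreated.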
Deserialization isn't always safe, but it isn't required for hashing
In the end, it is the same as YAML and pickle, so you have to choose carefully what is allowed to be deserialized.
I exclude Function deserialization by default, by removing the fromJSON method.
export const WriteOnlyFunctionAdapter: IRegistration = {
  ...FullFunctionAdapter,
  fromJSON: null
}
If you only need it for MongoDB, you don't need a library at all
The code is here: https://gist.github.com/patarapolw/c9fc59e71695ce256b442f36b93fd2dc
const cond = {
  a: new Date(),
  b: /regexp/gi
}

const r = JSON.stringify(cond, function (k, v) {
  // v has already gone through toJSON (a Date arrives as a string), so read the raw value from this
  const v0 = this[k]
  if (v0) {
    if (v0 instanceof Date) {
      return { $date: v0.toISOString() }
    } else if (v0 instanceof RegExp) {
      return { $regex: v0.source, $options: v0.flags }
    }
  }
  return v
})

console.log(r)

console.log(JSON.parse(r, (_, v) => {
  if (v && typeof v === 'object') {
    if (v.$date) {
      return new Date(v.$date)
    } else if (v.$regex) {
      return new RegExp(v.$regex, v.$options)
    }
  }
  return v
}))
Summary
This library has no dependencies, and is tested for most native objects.
The demo is at https://patarapolw.github.io/any-serialize/, and I have tested that the following object can be hashed.
const obj = {
  a: new Date(),
  r: /^hello /gi,
  f: (a, b) => a + b,
  s: new Set([1, 1, 'a']),
  c: new XClass(),
  miscell: [
    NaN,
    Infinity,
    // Pass a string, because a number literal this large loses precision before BigInt sees it
    BigInt('900719925474099133333332'),
    function fnLiteral (a) { return a }
  ]
}