Recently I was involved in a project where I had to load data from different providers, each one providing data in a different format,
JSON in the best case but even in
The data I received was dirty and far from what you would like to receive. For example:
- Extra white space in values
- Boolean values expressed using words like
- Datetime values always expressed in different formats
- Keys didn't respect a standard naming convention like
- Unexpected data structure changes without any warning by the provider
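To make these problems concrete, here is a minimal standard-library sketch of the kind of cleanup they call for. It is not the code from the project: the word lists, the datetime formats and every function name (`to_snake_case`, `parse_bool`, `parse_datetime`, `clean_record`) are assumptions chosen for illustration, and snake_case is just one possible naming convention.

```python
import re
from datetime import datetime

# Hypothetical word lists and formats: adjust them to what your providers send.
TRUE_WORDS = {"true", "yes", "y", "on", "1"}
FALSE_WORDS = {"false", "no", "n", "off", "0"}
DATETIME_FORMATS = (
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d %H:%M:%S",
    "%d/%m/%Y",
    "%Y-%m-%d",
)


def to_snake_case(key):
    """Convert keys such as 'firstName' or 'First Name' to 'first_name'."""
    key = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", key.strip())
    key = re.sub(r"[\s\-]+", "_", key)
    return key.lower()


def parse_bool(value):
    """Map common yes/no words to a bool; return None when unrecognized."""
    text = str(value).strip().lower()
    if text in TRUE_WORDS:
        return True
    if text in FALSE_WORDS:
        return False
    return None


def parse_datetime(value):
    """Try each known format in turn; return None when none matches."""
    text = str(value).strip()
    for fmt in DATETIME_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


def clean_record(record):
    """Strip string values and normalize keys, recursing into nested dicts."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, dict):
            value = clean_record(value)
        elif isinstance(value, str):
            value = value.strip()
        cleaned[to_snake_case(key)] = value
    return cleaned
```

Returning `None` for unrecognized values, instead of raising, keeps one bad field from aborting a whole import. It does not help with the last point in the list: structure changes still have to be detected separately.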
To wake up from this nightmare, I decided to normalize and simplify the whole process, reducing unexpected errors to nearly zero.
The abstract problem was always the same:
- Reading/writing data from/to different formats in a standard way
- Accessing nested data values quickly, using keypaths
- Getting data values and trying to parse them into the expected type
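The three points above can be sketched with plain dictionaries. This is a hand-rolled illustration of the idea, not the library's implementation: `get_by_keypath` and `get_int` are hypothetical helpers, and the dot separator and the sample payload are assumptions.

```python
import json

_MISSING = object()  # sentinel so that a stored None is not mistaken for "absent"


def get_by_keypath(data, keypath, default=None, separator="."):
    """Walk nested dicts (and lists, via integer segments) along a keypath."""
    current = data
    for part in keypath.split(separator):
        if isinstance(current, dict):
            current = current.get(part, _MISSING)
        elif isinstance(current, list) and part.isdecimal() and int(part) < len(current):
            current = current[int(part)]
        else:
            return default
        if current is _MISSING:
            return default
    return current


def get_int(data, keypath, default=0):
    """Fetch a value by keypath and coerce it to int, falling back to default."""
    value = get_by_keypath(data, keypath, _MISSING)
    if value is _MISSING:
        return default
    try:
        return int(str(value).strip())
    except ValueError:
        return default


# Load the data from one format (JSON here), then query it uniformly.
payload = json.loads('{"user": {"profile": {"age": " 42 "}, "tags": ["a", "b"]}}')

age = get_int(payload, "user.profile.age")            # dirty string parsed to an int
second_tag = get_by_keypath(payload, "user.tags.1")   # list index inside a keypath
missing = get_by_keypath(payload, "user.missing.x", "n/a")
```

Because every lookup goes through one function that takes a default, a provider silently dropping or renaming a field yields a fallback value rather than a `KeyError` deep inside the import code.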
I decided to write my own library.
python-benedict is a dict subclass with keypath support, I/O shortcuts (
query-string) and many utility methods.
It's open-source on GitHub:
Check it out, any feedback is appreciated.