Discussion on: CSV Challenge

Nice post!

import json
from csv import DictWriter

with open("data.json", "r") as f:
    users = json.load(f)

cols = ["name", "creditcard"]
with open("20150425.csv", "w", newline="") as f:
    dw = DictWriter(f, cols)
    dw.writeheader()
    for u in users:
        # Skip users whose credit card is missing or null.
        if u.get("creditcard"):
            dw.writerow({k: u[k] for k in cols})

All users share the same date, so I didn't bother writing separate files.
Another thing: I was going to write "Hey, that's not valid JSON you are giving us", because the objects are in a list and that list is not wrapped in an outer object. But my Python parser did not complain, so it turns out to be valid. You learn something new every day.
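
For anyone who wants to check this themselves, here is a minimal sketch using only the standard library. The sample data is made up for illustration and is not the challenge's data.json.

```python
import json

# Made-up sample text with an array at the root (not the real data.json).
text = '[{"name": "a", "creditcard": null}, {"name": "b", "creditcard": "1234"}]'

# json.loads accepts the top-level array and returns a Python list.
users = json.loads(text)
print(type(users).__name__, len(users))
```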

Tobias Salzmann • Nov 11 '17 • Edited

It seems JSON can have an array at the root, even according to the first standard: tools.ietf.org/html/rfc4627, section 2

jorin • Nov 11 '17 • Edited

Having an array at the top level of a JSON document is indeed valid, although I'd call it an anti-pattern: it blocks you from adding any meta information in the future.
If you build an API, you always want to wrap an array in an object. Then you can add fields such as errors or pagination later on without breaking clients.
For example:

{
  "data": [],
  "status": "not ok",
  "error": { "code": 123, "message": "..." },
  "page": 42
}
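
On the server side this could be a small helper, sketched here in Python. The function name `wrap` and its parameters are hypothetical; only the field names come from the example above.

```python
import json

def wrap(items, page=1, error=None):
    """Wrap a list in an envelope object (hypothetical helper)."""
    return {
        "data": items,
        "status": "ok" if error is None else "not ok",
        "error": error,
        "page": page,
    }

# Round trip: the client reads the array out of the "data" field.
body = json.dumps(wrap([{"name": "Ann"}], page=1))
decoded = json.loads(body)
customers = decoded["data"]
```

New fields can be added to the envelope later, and clients that only read "data" keep working.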

Tobias Salzmann • Nov 12 '17

Personally, I'd prefer the array in most cases. If I call an endpoint called customers, I would expect it to return an array of customers, not an object that contains such an array and may or may not carry an error, and so on.
If I want to stream the response, I'd also be better off with an array, because whatever streaming library I use probably supports it.
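
To illustrate the streaming point, here is a rough sketch of consuming a top-level array element by element as text chunks arrive. It is not how any particular streaming library works. The function name `iter_array` is made up, and it assumes every element is a JSON object (bare numbers split across chunks would be misread). Malformed input is not detected; it would just sit in the buffer.

```python
import json

def iter_array(chunks):
    """Yield the objects of a top-level JSON array from an iterable of text chunks."""
    decoder = json.JSONDecoder()
    buf = ""
    started = False
    for chunk in chunks:
        buf += chunk
        while True:
            buf = buf.lstrip()
            if not started:
                if not buf:
                    break  # nothing received yet
                if buf[0] != "[":
                    raise ValueError("expected a top-level array")
                buf = buf[1:]
                started = True
                continue
            if buf.startswith(","):
                buf = buf[1:].lstrip()  # drop the separator between elements
            if buf.startswith("]"):
                return  # end of the array
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                break  # element is incomplete, wait for the next chunk
            yield value
            buf = buf[end:]

# Chunk boundaries deliberately fall in the middle of elements.
chunks = ['[{"name": "An', 'n"}, {"name"', ': "Bob"}]']
names = [customer["name"] for customer in iter_array(chunks)]
```

Each customer is available as soon as its closing brace arrives, without waiting for the rest of the response.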