You got your hands on some data that was leaked from a social network and you want to help the poor people.
Luckily you know a government service to automatically block a list of credit cards.
The service is a little old school, though, and you have to upload a CSV file in exactly the right format. The upload fails if the CSV file contains invalid data.
The CSV file should have two columns, Name and Credit Card. Also, it must be named after the following pattern: YYYYMMDD.csv.
The leaked data doesn't have credit card details for every user and you need to pick only the affected users.
The data was published here:
You don't have much time to act.
What tools would you use to get the data, format it correctly and save it in the CSV file?
Do you have a crazy vim configuration that allows you to do all of this inside your editor? Are you a shell power user who would write this as a one-liner? How would you solve this in your favorite programming language?
Show your solution in the comments below!
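For reference, here is one way to read the requirements above, as a minimal Python sketch rather than a definitive solution. It assumes the leaked data is a JSON list of objects with `name` and `creditcard` keys (the field names used in the PowerShell comment below); the function names and the `header` switch are made up for illustration, since the post does not say whether the service expects a header line.

```python
import csv
import json
from datetime import date
from urllib.request import urlopen


def affected_rows(users):
    """Yield (name, creditcard) for users that have a credit card value."""
    for user in users:
        card = user.get("creditcard")  # assumed field name
        if card:  # skips missing, null and empty values
            yield user.get("name", ""), card


def write_csv(users, path=None, header=True):
    """Write the affected users to a CSV file named YYYYMMDD.csv by default."""
    if path is None:
        path = date.today().strftime("%Y%m%d") + ".csv"
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if header:  # assumption: the header line is optional
            writer.writerow(["Name", "Credit Card"])
        writer.writerows(affected_rows(users))
    return path


def fetch_users(url):
    """Download and parse the JSON document at `url` (expected: a list of objects)."""
    with urlopen(url) as response:
        return json.load(response)
```

With the address of the published data in a variable `url`, `write_csv(fetch_users(url))` would produce today's file in the current directory.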
Top comments (33)
PowerShell to the rescue!
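$json = Invoke-WebRequest 'gist.githubusercontent.com/jorinvo...' | ConvertFrom-Json
# Keep only users that actually have a credit card value, then export the two columns
$json | Where-Object creditcard | Select-Object name, creditcard | Export-Csv "$(Get-Date -Format yyyyMMdd).csv" -NoTypeInformation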
Excellent, man
ramda-cli:
scala:
A oneliner if you're a linuxer 😉
However, there is something you have not mentioned in your post: Should the CSV file have the header line?
If yes, then use this:
This adds quotes.
Maybe adding this sed command:
Doesn't the second solution need a >> in the last line, so the output is appended?
Yes, it does. (Didn't copy the correct version)
Thanks ☺
Aaaand Rust :)
Really an overkill for this task but fun nevertheless!
PHP:
You beat me to the PHP implementation. And your solution is so elegant.
Since the input JSON could be really large, here is a Node.js streaming version (using the stream-json package):
Nice! There is also csv-write-stream, which would save you some code :)
Using the CSV module to avoid any quoting pitfalls. :)
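To illustrate the quoting pitfall: a name that contains a comma breaks a naive join, while a CSV writer quotes the field. A minimal Python sketch using the standard `csv` module and made-up data:

```python
import csv
import io

rows = [("Doe, Jane", "1234-5678")]  # made-up data: the name contains a comma

# Naive join: the comma inside the name ends up looking like a column separator.
naive = "\n".join(",".join(row) for row in rows)

# csv.writer quotes the field, so the line still parses as two columns.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
quoted = buf.getvalue()

naive_fields = next(csv.reader(io.StringIO(naive)))
quoted_fields = next(csv.reader(io.StringIO(quoted)))
print(len(naive_fields), len(quoted_fields))
```

Reading both strings back shows the difference: the naive line splits into three fields, the quoted one into the intended two.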
Ruby is still one of the prettiest languages!
Maybe you can use open(url).read from require 'open-uri' instead of curl to allow it to run on other systems 🙂
Alternatively, it could look like this:
Oh, I like that! open-uri built-in. Also awesome.
Oneliner:
A few things to note:
cache is a program I wrote that caches command-line invocations. It makes it cheap to iterate (e.g. so you don't have to hit the network each time): github.com/JoshCheek/dotfiles/blob...
My shell is fish (fishshell.com), which allows multi-line editing, and the parentheses in fish are like backticks in bash, so the > (...) is redirecting the output into a file whose name is the result of the...
Nice post!
All users share the same date. So I didn't bother and didn't write into separate files.
Another thing, I was going to write "Hey, that's not valid JSON you are giving us.", because I saw the objects are in a list and that list is not wrapped in an outer object. But my Python parser did not complain, so it turns out to be valid. You learn something new every day.
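This is easy to check with Python's standard `json` module. A minimal sketch with a made-up document that has an array at the root:

```python
import json

# Made-up document: a list of objects that is not wrapped in an outer object.
doc = '[{"name": "Ann"}, {"name": "Bob"}]'

users = json.loads(doc)  # parses without complaint
print(type(users).__name__, len(users))
```

The parser returns a plain Python list, one dict per user.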
Having arrays at the top level of JSON documents is indeed valid, although it is definitely an anti-pattern. By doing so you block yourself from adding any meta information in the future.
If you build an API, you always want to wrap an array in an object. Then you can add additional fields like possible errors or pagination later on.
e.g.
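a hypothetical response envelope, sketched in Python. The field names `data`, `errors` and `next_page` are illustrative assumptions, not part of the original comment or of any particular API:

```python
import json

customers = [{"name": "Ann"}, {"name": "Bob"}]  # made-up data

# Bare array: there is no room for anything besides the items themselves.
bare = json.dumps(customers)

# Wrapped array: the same items, plus space for metadata added later on.
wrapped = json.dumps({"data": customers, "errors": [], "next_page": None})

# Clients read the items from the (hypothetical) "data" field.
items = json.loads(wrapped)["data"]
```

Fields such as `errors` or `next_page` can then be introduced later without changing the shape clients already rely on.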
Personally, I'd prefer the array in most cases. If I call an endpoint called customers, I would expect it to return an array of customers, not something that contains such an array, might or might not have an error and so on.
If I want to stream the response, I'd also be better off with an array, because whatever streaming library I use probably supports it.
Seems like JSON can have an array at the root, even according to the first standard: tools.ietf.org/html/rfc4627, section 2