Differential privacy for dummies (2016) - differential privacy is a way of working with data that doesn't reveal information about any one person or item in the data set. A neat example is asking people whether they are criminals, drug users, etc. There is a protocol for adding noise to the data, which each person can apply in turn as they answer, so that (1) you can't tell whether any one person is actually in the target class and (2) you can still compute population queries such as the total number or the proportion of people in the class. There are a bunch more examples in the article, as well as a brief theoretical overview. Interesting stuff for the age of big data and big surveillance.
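To make the noise-adding idea concrete, here is a minimal sketch of the classic coin-flip "randomized response" scheme. This is an assumption about the kind of protocol the article describes, not a copy of its exact procedure, and the function names, the 50/50 coin probabilities, and the simulated 30% true rate are all made up for illustration.

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Answer a sensitive yes/no question with plausible deniability.

    First coin flip: heads -> answer truthfully.
    Tails -> ignore the truth and answer with a second coin flip.
    """
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5


def estimate_proportion(responses: list[bool]) -> float:
    """Recover the population rate from the noisy answers.

    P(yes) = 0.5 * p + 0.25, so p = 2 * P(yes) - 0.5.
    """
    observed = sum(responses) / len(responses)
    return 2 * observed - 0.5


def simulate(n: int = 100_000, true_rate: float = 0.3, seed: int = 0) -> float:
    """Simulate n respondents (hypothetical data) and estimate the true rate."""
    rng = random.Random(seed)
    truths = [rng.random() < true_rate for _ in range(n)]
    responses = [randomized_response(t, rng) for t in truths]
    return estimate_proportion(responses)


if __name__ == "__main__":
    print(f"estimated rate: {simulate():.3f}")
```

Any single "yes" could have come from the second coin, so it says little about the individual, while the aggregate estimate still lands close to the true rate once enough people have answered.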
SSTable and log structured Storage: LevelDB (2012) - we've covered log-structured merge-trees here before, but mostly in the context of their use in Big Data storage engines like BigTable, HBase, Cassandra, etc. LevelDB is a storage engine built with the same methods, but aimed at "small data". It's an alternative to SQLite or flat files. The article goes into some detail about SSTables, which are the underlying data structure in many database engines (of the NoSQL variety at least).
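As a rough illustration of the idea, here is a toy sketch of a log-structured store: writes go to an in-memory table, which is flushed as an immutable sorted run (standing in for an SSTable) once it fills up. It is not LevelDB's API or on-disk format. The `TinyLSM` class and its `memtable_limit` parameter are hypothetical, the runs live in memory rather than in files, and compaction and deletes are left out entirely.

```python
import bisect


class TinyLSM:
    """Toy log-structured store: a memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit: int = 2):
        self.memtable = {}   # recent writes, unsorted
        self.sstables = []   # flushed runs, oldest first: (sorted keys, values)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as one sorted, immutable run."""
        if not self.memtable:
            return
        items = sorted(self.memtable.items())
        keys = [k for k, _ in items]
        values = [v for _, v in items]
        self.sstables.append((keys, values))
        self.memtable = {}

    def get(self, key):
        # The newest data wins: check the memtable, then runs newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for keys, values in reversed(self.sstables):
            i = bisect.bisect_left(keys, key)  # binary search in the sorted run
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None


if __name__ == "__main__":
    db = TinyLSM(memtable_limit=2)
    db.put("a", 1)
    db.put("b", 2)   # triggers a flush
    db.put("a", 3)   # newer value shadows the flushed one
    print(db.get("a"), db.get("b"), db.get("z"))
```

Because each run is sorted and never modified, lookups can binary-search it and writes never touch old data. A real engine would also merge runs in the background so reads do not have to scan an ever-growing list.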
Protocol buffers, Avro, Thrift & Message Pack (2011) - just a bit of commentary on the various serialization and RPC mechanisms.
Measuring & optimizing I/O performance (2009) - written in the age of disks, there's still a bit of interesting stuff here, mainly around the use of iostat.
Terraforming Stack Overflow Enterprise in AWS (2018) - the story of how Palantir "productionized" their install of Stack Overflow Enterprise. A nice overview of the SO architecture is included as well. If you're a big company looking for a knowledge management solution, do know that SOE exists.
Evaluating options for Amazon's HQ2 using Stack Overflow data (2018) - a data journalism article from our data team's Julia Silge. Amazon is looking for a place to set up its second headquarters in the US and is running a sort of contest for cities. Much like the Olympics, it seems like a losing proposition for cities, but they can't afford not to participate. Julia looked at the technical makeup of the workforce in the 20 finalist cities to see how well they'd match the things Amazon would need.