DEV Community: Appcanary

We Left Clojure. Here's 5 Things I'll Miss.

Max Veytsman — Mon, 23 Jan 2017 16:36:44 +0000

On October 11th, Appcanary relied on about 8,500 lines of clojure code. On the 12th we were down to zero. We replaced it by adding another 5,700 lines of Ruby to our codebase. Phill will be discussing why we left, and what we learned both here and at this year's RubyConf. For now, I want to talk about what I'll miss.

1) The joy of Lisp

There's something magical about writing lisp. Alan Kay called it the greatest single programming language ever devised. Paul Graham called it a secret weapon. You can find tens of thousands of words on the elegant, mind-expanding powers of lisp¹. I don't think my version of the Lisp wizardry blog post would be particularly original or unique, so if you want to know more about the agony and ecstasy of wielding parenthesis, read Paul Graham.

What's great about Clojure is that while Ruby might be an acceptable lisp,
and lisp might not be an acceptable lisp, Clojure is a more than acceptable lisp. If we avoid the minefield of type systems, Clojure addresses the other 4 problems Steve Yegge discusses in the previous link².

2) Immutability

The core data structures in clojure are immutable. If I define car to be "a dirty van", nothing can ever change that. I can name some other thing car later, but anything referencing that first car will always be referencing "a dirty van".

This is great for a host of reasons. For one, you get parallelization for free — since nothing will mutate your collection, mapping or reducing some function over it can be hadooped out to as many clouds as you want without changing your algorithms.

It's also much easier to can reason about your code. There's a famous quote by Larry Wall:

[Perl] would prefer that you stayed out of its living room because you weren't invited, not because it has a shotgun.

He was talking about private methods, but the same is true for mutability in most languages. You call some method and who knows if it mutated a value you were using? You would prefer it not to, but you have no shotgun, and frankly it's so easy to mutate state without even knowing that you are. Consider Python:

str1 = "My name "
str2 = str1
str1 += "is Max"
print str1
# "My name is Max"
print str2
# "My name"

list1 = [1, 2, 3]
list2 = list1
list1 += [4, 5]
print list1
# [1, 2, 3, 4, 5]
print list2
# [1, 2, 3, 4, 5]

Calling += on a string returned a new one, while calling += on a list mutated it in place! I have to remember which types are mutable, and whether += will give me a new object or mutate the existing one depending on its type. Who knows what might happen when you start passing your variables by reference to somewhere else?

Not having the choice to mutate state is as liberating as getting rid of your Facebook account.

3) Data first programming

Walking away from object-oriented languages is very freeing.

I want to design a model for the game of poker. I start by listing the nouns³: "card", "deck", "hand", "player", "dealer", etc. Then I think of the verbs, "deal", "bet", "fold", etc.

Now what? Here's a typical StackOverflow question demonstrating the confusion that comes with designing like this. Is the dealer a kind of player or a separate class? If players have hands of cards, how does the deck keep track of what cards are left?

At the end of the day, the work of programming a poker game is codifying all of the actual rules of the game, and these will end up in a Game singleton that does most of the work anyway.

If you start by thinking about data and the functions that operate on it, there's a natural way to solve hard problems from the top-down, which lets you quickly iterate your design (see below). You have some data structure that represents the game state, a structure representing possible actions a player can take, and a function to transform a game state and an action into the next game state. That function encodes the actual rules of poker (defined in lots of other, smaller functions).

I find this style of programming very natural and satisfying. Of course, you can do this in any language; but I find Clojure draws me towards it, while OO languages push me away from it.

4) Unit Testing

The majority of your code is made up of pure functions. A pure function is one which always gives the same output for a given input — doesn't that sound easy to test? Instead of setting up test harnesses databases and mocks, you just write tests for your functions.

Testing the edges of your code that talk to the outside world requires mocking, of course, and integration testing is never trivial. But the first thing you want to test is the super-complicated piece of business logic deep in your codebase. The business logic your business depends on, like for instance computing whether your version of OpenSSL is vulnerable to HeartBleed.

Clojure pushes you to make that bit of code a pure function that's testable without setting up complicated state.

5) Refactoring

Here's a typical clojure function

(defn foo [a b]
  ;; some code here
  (let [c (some-function a b)]
    ;; a ton of 
    ;; complicated code here
)))

In lisp-speak, a parenthesized block is called a "form". The foo form is the outer form, and it contains the let form, which ostensibly contains other forms that do complicated things.

I know that all the complicated code inside of the let form isn't going to mutate any state, and that it's only dependent on the a and b variables. This means that refactoring this code out into its own functions is as trivial as selecting everything between two matching parentheses and cutting and pasting it out. If you have an editor that supports paredit-style navigation of lisp forms, you can rearrange code at lightning speed.

My favourite essay of this ilk is Mark Tarver's melancholy
The Bipolar Lisp Programmer.
He describes lisp as a language designed by and for brilliant failures. Back in university, I ate this shit up. My grades were obvious evidence of half the requirement of being a lisp programmer. ↩
I'm aware that clojure's gensym does not a hygenic macro system make. But, if you have strong opinions on hygenic macros as they relate to acceptable lisps, this article might not be for you. ↩
For the record, I know that this isn't the "right" way to design OO programs, but the fact that I have to acknowledge this proves my point. ↩

Should you encrypt or compress first?

Max Veytsman — Tue, 17 Jan 2017 21:27:27 +0000

Imagine this:

You work for a big company. Your job is pretty boring. Frankly, your talents are wasted writing boilerplate code for an application whose only users are three people in accounting who can't stand the sight of you.

Your real passion is security. You read r/netsec every day and participate in bug bounties after work. For the past three months, you've been playing a baroque stock trading game that you're winning because you found a heap-based buffer overflow and wrote some AVR shellcode to help you pick stocks.

Everything changes when you discover that what you had thought was a video game was actually a cleverly disguised recruitment tool. Mont Piper, the best security consultancy in the world, is hiring — and you just landed an interview!

A plane ride and an Uber later, you're sitting across from your potential future boss: a slightly sweaty hacker named Gary in a Norwegian metal band t-shirt and sunglasses he refuses to take off indoors.

You blast through the first part of the interview. You give a great explanation of the difference between privacy and anonymity. You describe the same origin policy in great detail, and give three ways an attacker can get around it. You even whiteboard the intricacies of __fastcall vs __stdcall. Finally, you're at the penultimate section, protocol security.

Gary looks you in the eyes and says: "You're designing a network protocol. Do you compress the data and then encrypt it, or do you encrypt and then compress?" And then he clasps his hands together and smiles to himself.

A classic security interview question!

Take a second and think about it.

At a high level, compression tries to use patterns in data in order to reduce its size. Encryption tries to shuffle data in such a way that without the key, you can't find any patterns in the data at all.

Encryption produces output that appears random: a jumble of bits with a lot of entropy. Compression doesn't really work on data that appears random — entropy can actually be thought of as a measure of how "compressable" some data is.

So if you encrypt first, your compression will be useless. The answer must be to compress first! Even StackOverflow thinks so.

You start to say this to Gary, but you stop mid-sentence. An attacker sniffing encrypted traffic doesn't get much information, but they do get to learn the length of messages. If they can somehow use that to learn more information about the message, maybe they can foil the encryption.

You start explaining this to Gary, and he interrupts you — "Oh you mean like the CRIME attack?"

"Yes!" you reply. You start to recall the details of it. All the SSL attacks with catchy names are mixed together in your mind, but you're pretty sure that's the one. They controlled some information that was being returned by the server, and used that to generate guesses for a secret token present in the response. The response was compressed in such a way that you could validate guesses for the secret by seeing how you affected the length of the compressed message. If the secret was AAAA and you guessed AAAA, the compressed-then-encrypted response will be shorter than if you guessed BBBB.

Gary looks impressed. "But what if the attacker can't control any of the plaintext in any way? Is this kind of attack still possible?" he asks.

CRIME was a very cool demonstration of how compress-then-encrypt isn't always the right decision, but my favorite compress-then-encrypt attack was published a year earlier by Andrew M. White, Austin R. Matthews, Kevin Z. Snow, and Fabian Monrose. The paper Phonotactic Reconstruction of Encrypted VoIP Conversations gives a technique for reconstructing speech from an encrypted VoIP call.

Basically, the idea is this: VoIP compression isn't going to be a generic audio compression algorithm, because we can rely on some assumptions about human speech in order to compress more efficiently. From the paper:

Many modern speech codecs are based on variants of a well-known speech coding scheme known as code-excited linear prediction (CELP) [49], which is in turn based on the source-filter model of speech prediction. The source-filter model separates the audio into two signals: the excitation or source signal, as produced by the vocal cords, and the shape or filter signal, which models the shaping of the sound performed by the vocal tract. This allows for differentiation of phonemes; for instance, vowels have a periodic excitation signal while fricatives (such as the [sh] and [f] sounds) have an excitation signal similar to white noise [53].

In basic CELP, the excitation signal is modeled as an entry from a fixed codebook (hence code-excited). In some CELP variants, such as Speex's VBR (variable bit rate) mode, the codewords can be chosen from different codebooks depending on the complexity of the input frame; each codebook contains entries of a different size. The filter signal is modeled using linear prediction, i.e., as a so-called adaptive codebook where the codebook entries are linear combinations of past excitation signals. The “best entries from each codebook are chosen by searching the space of possible codewords in order to “perceptually optimize the output signal in a process known as analysis-by-synthesis [53]. Thus an encoded frame consists of a fixed codebook entry and gain (coefficient) for the excitation signal and the linear prediction coefficients for the filter signal.

Lastly, many VoIP providers (including Skype) use VBR codecs to minimize bandwidth usage while maintaining call quality. Under VBR, the size of the codebook entry, and thus the size of the encoded frame, can vary based on the complexity of the input frame. The specification for Secure RTP (SRTP) [3] does not alter the size of the original payload; thus encoded frame sizes are preserved across the cryptographic layer. The size of the encrypted packet therefore reflects properties of the input signal; it is exactly this correlation that our approach leverages to model phonemes as sequences of lengths of encrypted packets.

That pretty much summarizes the paper. CELP + VBR means that message length is going to depend on complexity. Due to how linear prediction works, more information is needed to encode a drastic change in sound — like the pause between phonemes! This allows the authors to build a model that can break an encrypted audio signal into phonemes: that is, deciding which audio frames belong to which unit of speech.

They then built a classifier that, still only using the packet length information they started with, decides which segmented units of encrypted audio represent which actual phonemes. They then use a language model to correct the previous step's output and segment the phoneme stream into words and then phrases.

The crazy thing is that this whole rigmarole works! They used a metric called METEOR and got scores of around .6. This is on a scale where <.5 is considered "interpretable by a human." Considering that the threat vector here is a human using this technique to listen in on your encrypted VoIP calls — that's pretty amazing!

Epilogue

After passing the rigorous all-night culture fit screening, you end up getting the job. Six months later, Mont Piper is sold to a large conglomerate. Gary refuses to trade in his Norwegian metal t-shirts for a button-down and is summarily fired. You now spend your days going on-site to a big bank, "advising" a team that hates your guts.

But recently, you've picked up machine learning and found this really cool online game where you try to make a 6-legged robot walk in a 3d physics simulation...

This article was originally posted on the Appcanary blog.