This post was originally published on my blog
Earlier this year I saw a post on Mashable which had some amazing photos from World War Two. They really bring the era to life so I was thinking about how we could automate the process of colourising photos and save a ton of 💰 - it can cost up to £3k/minute to colourise video professionally.
DockerCon EU was coming up and I really wanted to attend. My idea sounded like the perfect solution to my problem! Enlisting the help of my friend Oli Callaghan, we started writing code. We tried a couple of different approaches, notably writing our own machine learning framework from scratch in C++ and training a TensorFlow network. When it came down to it, these solutions were always limited by one thing: time. DockerCon was in a few weeks' time and our networks were taking far too long to train (we're talking weeks).
So, we were forced to go with another approach. We decided to build on the work by Richard Zhang et al. and use their prebuilt model to colourise our photos. We were able to successfully deploy their Caffe model at scale with the open source serverless framework OpenFaaS. Using OpenFaaS allowed us to concentrate on the integration of the colourisation, instead of the underlying infrastructure.
Having got the colourisation working in a serverless function, we decided to extend it so that our audience at DockerCon could get involved.
Exploiting the flexibility of OpenFaaS, we wrote some extra serverless functions so that people could tweet black and white images to @colorisebot and have it reply with the colourised image.
This is what our function stack looks like:
Okay, so we can colourise photos. However, our main goal was always to colourise videos. How did this go? Pretty well!
Here's how it works. We simply split the video up into frames and then drop them into the OpenFaaS function stack, gather up the colourised frames and stitch them back together with ffmpeg.
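As a rough illustration of that split, colourise and stitch loop, here is a minimal Python sketch. It is not our actual code: the gateway address, the function name `colorise`, the frame rate and the idea that the function takes raw image bytes and returns raw image bytes are all assumptions made for the example. The only things it relies on are the standard `ffmpeg` command line and the OpenFaaS gateway's `/function/<name>` route.

```python
import subprocess
import urllib.request
from pathlib import Path

GATEWAY = "http://127.0.0.1:8080"  # assumed OpenFaaS gateway address
FUNCTION = "colorise"              # hypothetical function name


def split_cmd(video, frames_dir):
    """ffmpeg command that writes each frame of `video` as a numbered PNG."""
    return ["ffmpeg", "-i", str(video), str(Path(frames_dir) / "%05d.png")]


def stitch_cmd(frames_dir, output, fps=24):
    """ffmpeg command that joins numbered PNGs back into an H.264 video."""
    return [
        "ffmpeg", "-framerate", str(fps),
        "-i", str(Path(frames_dir) / "%05d.png"),
        "-c:v", "libx264", "-pix_fmt", "yuv420p", str(output),
    ]


def function_url(gateway=GATEWAY, name=FUNCTION):
    """URL used to invoke a function through the OpenFaaS gateway."""
    return f"{gateway.rstrip('/')}/function/{name}"


def colourise_frame(png_bytes, url):
    """POST one frame to the function and return the response body.

    Assumes (for this sketch only) that the function accepts and returns
    raw image bytes.
    """
    request = urllib.request.Request(url, data=png_bytes, method="POST")
    with urllib.request.urlopen(request) as response:
        return response.read()


def colourise_video(video, output, workdir, fps=24):
    """Split the video, colourise every frame in order, then stitch."""
    grey = Path(workdir) / "grey"
    colour = Path(workdir) / "colour"
    grey.mkdir(parents=True, exist_ok=True)
    colour.mkdir(parents=True, exist_ok=True)

    subprocess.run(split_cmd(video, grey), check=True)
    url = function_url()
    for frame in sorted(grey.glob("*.png")):
        (colour / frame.name).write_bytes(colourise_frame(frame.read_bytes(), url))
    subprocess.run(stitch_cmd(colour, output, fps), check=True)
```

With ffmpeg installed and a gateway running, `colourise_video("clip.mp4", "clip_colour.mp4", "work")` would process a clip one frame at a time. Sending frames sequentially keeps the sketch short; in practice the frames are independent, so they could be sent to the function in parallel.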
The OpenFaaS community has been very supportive of us, helping test, develop and inspire Oli and me to keep improving the project. Although the original idea was ours, the development and deployment have been a group effort.
In particular, we'd like to say a massive thank you to OpenFaaS's project lead, Alex Ellis, for his continued help and support. He's been a great mentor, giving us invaluable advice on everything from presentation guidance to OpenFaaS setup and configuration. He's even written a neat function that normalises the images to help with the sepia effect, which has given great results.
Alex has just launched the idea of "Pods" in the OpenFaaS community. This is an idea to help manage the work on the project and decentralise leadership. Alex writes:
- 7 +/- 2 people
- Mixed ability
- Gravitate to areas of the project that interest them the most
- Get special mentions in GitHub issues, PRs and release notes
- Work collaboratively or independently
- Are represented by a Pod Advocate who is willing to commit to give time to the Pod and the project
I'm really excited about this idea and look forward to joining a pod!
The OpenFaaS community is very inclusive and friendly. If you're interested in getting involved, just ping an email to Alex at firstname.lastname@example.org and he'll get you added to the Slack group.
Feel free to have a play with @colorisebot on Twitter and let us know what you think!
The code is all open source (check out the original post here) so take a look and have a hack.
I've got a cool little Grafana dashboard linked up to the OpenFaaS Prometheus metrics so I can see what's going on in real time. Incidentally, this weekend has been a very busy period for the bot, racking up over 1130 colourised images on Twitter! I've put together a collection of the best here: https://storify.com/developius/best-of-colorisebot
In the future we'd like to run the conversions on GPUs so that we can leverage the power of graphics processors to run our network even faster. We think we can decrease execution time by a factor of 100 (5s -> 50ms) by using a GPU.
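To put those per-frame numbers in context for video, here is a quick back-of-the-envelope calculation. The 5s and 50ms figures are the ones quoted above; the one-minute clip length, the 24 fps frame rate and the assumption that frames are processed one after another by a single worker are made up for the example.

```python
def processing_seconds(duration_s, fps, ms_per_frame):
    """Total time to colourise a clip when frames are processed one by one."""
    frames = duration_s * fps
    return frames * ms_per_frame / 1000


# One minute of video at an assumed 24 frames per second.
cpu_seconds = processing_seconds(60, 24, 5000)  # 5s per frame
gpu_seconds = processing_seconds(60, 24, 50)    # 50ms per frame (our estimate)

print(f"CPU: {cpu_seconds / 60:.0f} minutes")   # CPU: 120 minutes
print(f"GPU: {gpu_seconds:.0f} seconds")        # GPU: 72 seconds
```

Under those assumptions, a minute of footage drops from about two hours of sequential work to a little over a minute.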
We'd also like to try a recurrent network, which learns from the frames that came before the current one. This should help with the video conversion, which appears to flicker because some of the frames are slightly incorrect.
Our slides are available here and you can watch our DockerCon talk here. Oli has also written a post about the ins and outs (pun intended) of machine learning and also goes over how we adapted our program for video.