DEV Community: James Thomas

Faster File Transfers With Serverless

James Thomas — Thu, 29 Aug 2019 10:42:13 +0000

This week I've been helping a client speed up file transfers between cloud object stores using serverless.

They had a 120GB file on a cloud provider's object store. This needed copying into a different cloud object store for integration with platform services. Their current file transfer process was to download the file locally and then re-upload using a development machine. This was taking close to three hours due to bandwidth issues.

Having heard about the capabilities of serverless cloud platforms, they were wondering if they could use the massive parallelism that serverless provides to speed up that process? 🤔

After some investigating, I worked out a way to use serverless to implement concurrent file transfers. Transfer time was reduced from THREE HOURS to just FOUR MINUTES! This was a decrease in total transfer time of 98%. 👏👏👏

In this blog post, I'll outline the simple steps I used to make this happen. I've been using IBM Cloud Functions as the serverless platform. Two different S3-compatible Object Stores were used for the file transfers. The approach should work for any object store with the features outlined below.

S3-Compatible API Features

Both object stores being used for the file transfers provided an S3-compatible API. The S3 API has two features that, when combined, enable concurrent file transfers: Range Reads and Multi-Part Transfers.

Range Reads

The HTTP/1.1 protocol defines a Range header which allows the client to retrieve part of a document. The client specifies a byte range using the header value, e.g. Range: bytes=0-499. The byte values are then returned in the HTTP response with a HTTP 206 status code. If the byte range is invalid, a HTTP 416 response is returned.

The S3 API supports Range request headers on GET HTTP requests for object store files.

Sending a HTTP HEAD request for an object store file will return the file size (using the Content-Length header value). Creating ranges for fixed byte chunks up to this file size (0-1023, 1024-2047, 2048-3071 ...) allows all sections of a file to be retrieved in parallel.
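
A 206 response also carries a Content-Range header (e.g. bytes 0-499/1234) stating which bytes came back and the total size. Here is a minimal sketch of parsing that header to double-check a chunk matches the range that was requested. The parse_content_range helper is a hypothetical name for illustration, not part of any S3 client.

```javascript
// Parse a Content-Range header value such as "bytes 0-499/1234".
// Returns null when the value does not describe a returned byte range.
const parse_content_range = (value) => {
  const match = /^bytes (\d+)-(\d+)\/(\d+|\*)$/.exec(value)
  if (!match) return null

  const start = Number(match[1])
  const end = Number(match[2])
  // The total size may be "*" when the server does not know it.
  const total = match[3] === '*' ? null : Number(match[3])

  // Byte ranges are inclusive, so the length is end - start + 1.
  return { start, end, total, length: end - start + 1 }
}

console.log(parse_content_range('bytes 0-499/1234'))
```

The length field can be compared against the size of the response body before the chunk is uploaded.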

Multi-Part Transfers

Files are uploaded to buckets using HTTP PUT requests. These operations support a maximum file size of 5GB. Uploading larger files is only possible using "Multi-Part" transfers.

Clients initiate a multi-part transfer using the API and are returned an upload identifier. The large file is then split into parts which are uploaded using individual HTTP PUT requests. The upload identifier is used to tag individual requests as belonging to the same file. Once all parts have been uploaded, the API is used to confirm the file is finished.

File parts do not have to be uploaded in consecutive order and multiple parts can be uploaded simultaneously.
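
Multi-part transfers come with limits on part size and part count, so it is worth checking a chunk size before starting. The sketch below assumes the limits documented for AWS S3 (at most 10,000 parts, and a 5MB minimum for every part except the last). Other S3-compatible stores may differ, so treat the constants as assumptions. The count_parts helper is a hypothetical name.

```javascript
// Limits documented for AWS S3. Assumed here, check your provider.
const MAX_PARTS = 10000
const MIN_PART_BYTES = 5 * 1024 * 1024

// Returns the number of parts needed, or throws if the plan breaks a limit.
const count_parts = (file_bytes, part_bytes) => {
  if (part_bytes < MIN_PART_BYTES) {
    throw new Error('part size is below the 5MB minimum')
  }
  const parts = Math.ceil(file_bytes / part_bytes)
  if (parts > MAX_PARTS) {
    throw new Error(`${parts} parts exceeds the ${MAX_PARTS} part limit`)
  }
  return parts
}

const MB = 1024 * 1024
const GB = 1024 * MB
console.log(count_parts(120 * GB, 100 * MB)) // 1229
```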

Serverless File Transfers

Combining these two features, I was able to create a serverless function to copy a part of a file between source and destination buckets. By invoking thousands of these functions in parallel, the entire file could be simultaneously copied in parallel streams between buckets. This was controlled by a local script used to manage the function invocations, monitor progress and complete the multi-part transfer once invocations had finished.

Serverless Function

The serverless function copies a file part between object stores. It is invoked with all the parameters needed to access both bucket files, byte range to copy and multi-part transfer identifier.

exports.main = async function main (params) {
  const { src_bucket, src_file, range, dest_bucket, dest_file, mpu, index} = params
  const byte_range = await read_range(src_bucket, src_file, range)
  const upload_result = await upload_part(dest_bucket, dest_file, mpu, index, byte_range)
  return upload_result
}

Read Source File Part

The S3-API JS client can create a "Range Read" request by passing the Range parameter with the byte range value, e.g. bytes=0-NN.

const read_range = async (Bucket, Key, Range) => {
  const file_range = await s3.getObject({Bucket, Key, Range}).promise()
  return file_range.Body
}

Upload File Part

The uploadPart method is used to complete a part of a multi-part transfer. The method needs the UploadId created when initiating the multi-part transfer and the PartNumber for the chunk index. The ETag for the uploaded content is returned.

const upload_part = async (Bucket, Key, UploadId, PartNumber, Body) => {
  const result = await s3.uploadPart({Bucket, Key, UploadId, PartNumber, Body}).promise()
  return result
}

Note: The uploadPart method does not support streaming Body values unless they come from the filesystem. This means the entire part has to be read into memory before uploading. The serverless function must have enough memory to handle this.

Local Script

The local script used to invoke the functions has to do the following things...

Create and complete the multi-part transfer
Calculate file part byte ranges for function input parameters
Copy file parts using concurrent function invocations.

Create Multi-Part Transfers

The S3-API JS client can be used to create a new Multi-Part Transfer.

const { UploadId } = await s3.createMultipartUpload({Bucket: '...', Key: '...'}).promise()

The UploadId can then be used as an input parameter to the serverless function.

Create Byte Ranges

Source file sizes can be retrieved using the client library.

const file_size = async (Bucket, Key) => {
  const { ContentLength } = await s3.headObject({Bucket, Key}).promise()
  return ContentLength
}

This file size needs splitting into consecutive byte ranges of fixed size chunks. This function will return an array of the HTTP Range header values (bytes=N-M) needed.

const split_into_ranges = (bytes, range_mbs) => {
  const range_size = range_mbs * 1024 * 1024
  const ranges = []
  let range_offset = 0
  const last_byte_range = bytes - 1

  // Use <= so a final one-byte range (offset equal to the last byte) is not skipped.
  while(range_offset <= last_byte_range) {
    const start = range_offset
    // Last byte range may be less than chunk size where file size
    // is not an exact multiple of the chunk size.
    const end = start + Math.min((range_size - 1), last_byte_range - start)
    ranges.push(`bytes=${start}-${end}`)
    range_offset += range_size
  }

  return ranges
}
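
Off-by-one mistakes are easy to make with inclusive byte ranges, for example when the file is exactly one byte larger than a multiple of the chunk size. Here is a small sanity check I'd suggest running on the generated ranges before invoking any functions. The ranges_cover_file helper is a hypothetical name, not part of the original script.

```javascript
// Check that an array of "bytes=N-M" values covers every byte of a
// file exactly once, in order, with no gaps or overlaps.
const ranges_cover_file = (ranges, file_bytes) => {
  let next_byte = 0
  for (const range of ranges) {
    const match = /^bytes=(\d+)-(\d+)$/.exec(range)
    if (!match) return false

    const start = Number(match[1])
    const end = Number(match[2])
    // Each range must start where the previous one stopped.
    if (start !== next_byte || end < start) return false
    next_byte = end + 1
  }
  // The byte after the last range must equal the file size.
  return next_byte === file_bytes
}

console.log(ranges_cover_file(['bytes=0-1023', 'bytes=1024-2047', 'bytes=2048-2499'], 2500)) // true
console.log(ranges_cover_file(['bytes=0-1023', 'bytes=1024-2047'], 2049)) // false
```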

Invoke Concurrent Functions

Serverless functions need to be invoked for each byte range calculated above. Depending on the file and chunk sizes used, the number of invocations needed could be larger than the platform's concurrency rate limit (defaults to 1000 on IBM Cloud Functions). In the example above (120GB file in 100MB chunks), 1229 invocations would be needed.

Rather than executing all the byte ranges at once, the script needs to use a maximum of 1000 concurrent invocations. When initial invocations finish, additional functions can be invoked until all the byte ranges have been processed. This code snippet shows a solution to this issue (using IBM Cloud Functions JS SDK).

const parallel = require('async-await-parallel');
const retry = require('async-retry');
const openwhisk = require('openwhisk');

const concurrent = 1000
const retries = 3
const chunk_size = 100

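// Map local variable names onto the parameter names the function expects.
const static_params = {
  src_bucket: source_bucket, src_file: source_filename,
  dest_bucket, dest_file: dest_filename, mpu
}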

const ow = openwhisk({...});

const bucket_file_size = await file_size(source_bucket, source_filename);
const ranges = split_into_ranges(bucket_file_size, chunk_size);

const uploads = ranges.map((range, index) => {
  const invoke = async () => {
    const params = Object.assign({range, index: index + 1}, static_params)
    const upload_result = await ow.actions.invoke({
      name: '...', blocking: true, result: true, params
    })
    return upload_result
  }

  // async-retry takes an options object, not a bare number.
  return async () => retry(invoke, { retries })
})

const finished = await parallel(uploads, concurrent)

The uploads value is an array of lazily evaluated serverless function invocations. The code snippet uses the async-await-parallel library to limit the number of concurrent invocations. Intermittent invocation errors are handled by the async-retry library, which retries failed invocations up to three times.
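
To show the idea behind limiting concurrency, here is a dependency-free sketch of a worker-pool style limiter. This is not how async-await-parallel is implemented. It is just one simple way to get the same effect, and run_limited is a hypothetical name.

```javascript
// Run an array of lazy async tasks (functions returning promises) with
// at most `limit` in flight. Results keep the order of the input array.
const run_limited = async (tasks, limit) => {
  const results = new Array(tasks.length)
  let next = 0

  // Each worker pulls the next unstarted task until none remain.
  const worker = async () => {
    while (next < tasks.length) {
      const index = next++
      results[index] = await tasks[index]()
    }
  }

  // Start up to `limit` workers and wait for all of them to drain the queue.
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, worker)
  await Promise.all(workers)
  return results
}
```

Note that a rejected task rejects the whole run in this sketch, which is why wrapping each task with a retry helper first is still useful.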

Finish Multi-Part Transfer

Once all parts have been uploaded, ETags (returned from the serverless invocations) and the Part Numbers are used to complete the multi-part transfer.

// Named Parts so it can be passed as the MultipartUpload.Parts property below.
const Parts = finished.map((part, idx) => {
  part.PartNumber = idx + 1
  return part
})

const { Location, Bucket, Key, ETag } = await s3.completeMultipartUpload({
  Bucket: '...', Key: '...', UploadId: '...', MultipartUpload: { Parts }
}).promise()

Results

The previous file transfer process (download locally and re-upload from development machine) was taking close to three hours. Counting both the download and the upload, this was an average throughput rate of 1.33GB/minute ((120GB * 2) / 180 minutes).

Using serverless functions, the entire process was completed in FOUR MINUTES. File chunks of 100MB were transferred in parallel using 1229 function invocations. This was an average throughput rate of 60GB/minute ((120GB * 2) / 4 minutes). That was a reduction in total transfer time of ~98%. 💯💯💯

Serverless makes it incredibly easy to run embarrassingly parallel workloads in the cloud. With just a few lines of code, the file transfer process can be parallelised using 1000s of concurrent functions. The client was rather impressed as you can imagine... 😎

Serverless Functions With WebAssembly Modules

James Thomas — Thu, 08 Aug 2019 09:50:20 +0000

Watching a recent talk by Lin Clark and Till Schneidereit about WebAssembly (Wasm) inspired me to start experimenting with using WebAssembly modules from serverless functions.

This blog post demonstrates how to invoke functions written in C from Node.js serverless functions. Source code in C is compiled to Wasm modules and bundled in the deployment package. Node.js code implements the serverless platform handler and calls native functions upon invocations.

The examples should work (with some modifications) on any serverless platform that supports deploying Node.js functions from a zip file. I'll be using IBM Cloud Functions (Apache OpenWhisk).

WebAssembly

WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable target for compilation of high-level languages like C/C++/Rust.

https://webassembly.org/

Wasm started as a project to run low-level languages in the browser. This was envisioned as a way to execute computationally intensive tasks in the client, e.g. image manipulation, machine learning, graphics engines. This would improve performance for those tasks compared to using JavaScript.

WebAssembly compiles languages like C, C++ and Rust to a portable instruction format, rather than platform-specific machine code. Compiled Wasm files are interpreted by a Wasm VM in the browser or other runtimes. APIs have been defined to support importing and executing Wasm modules from JavaScript runtimes. These APIs have been implemented in multiple browsers and recent Node.js versions (v8.0.0+).

This means Node.js serverless functions, using a runtime version above 8.0.0, can use WebAssembly!

Wasm Modules + Serverless

"Why would we want to use WebAssembly Modules from Node.js Serverless Functions?" 🤔

Performance

Time is literally money with serverless platforms. The faster the code executes, the less it will cost. Using C, C++ or Rust code, compiled to Wasm modules, for computationally intensive tasks can be much faster than the same algorithms implemented in JavaScript.

Easier use of native libraries

Node.js already has a way to use native libraries (in C or C++) from the runtime. This works by compiling the native code during the NPM installation process. Libraries bundled in deployment packages need to be compiled for the serverless platform runtime, not the development environment.

Developers often resort to using specialised containers or VMs, that try to match the runtime environments, for library compilation. This process is error-prone, difficult to debug and a source of problems for developers new to serverless.

Wasm is deliberately platform independent. This means Wasm code compiled locally will work on any Wasm runtime. No more worrying about platform architectures and complex toolchains for native libraries!

Additional runtime support

Dozens of languages now support compiling to WebAssembly.

Want to write serverless functions in Rust, C, or Lua? No problem! By wrapping Wasm modules with a small Node.js handler function, developers can write their serverless applications in any language with "compile to Wasm" support.

Developers don't have to be restricted to the runtimes provided by the platform.

JS APIs in Node.js

Here is the code needed to load a Wasm module from Node.js. Wasm modules are distributed in .wasm files. Loaded modules are instantiated into instances, by providing a configurable runtime environment. Functions exported from Wasm modules can then be invoked on these instances from Node.js.

const fs = require('fs')

const wasm_module = 'library.wasm'
const bytes = fs.readFileSync(wasm_module)
const wasmModule = new WebAssembly.Module(bytes);
const wasmMemory = new WebAssembly.Memory({initial: 512});
const wasmInstance = new WebAssembly.Instance(wasmModule, { env: { memory: wasmMemory } })

Calling Functions

Exported Wasm functions are available on the exports property of the wasmInstance. These properties can be invoked as normal functions.

const result = wasmInstance.exports.add(2, 2)
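
To try this without compiling anything, here is a sketch that instantiates a tiny hand-assembled Wasm binary exporting a single add function. The byte array is written out by hand for illustration (it is not the output of the toolchain used later in this post) and the tiny_module and tiny_instance names are hypothetical.

```javascript
// Hand-assembled Wasm binary exporting add(i32, i32) -> i32.
const add_bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic number + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: function 0 uses type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b                    // local.get 0, local.get 1, i32.add, end
])

// This module has no imports, so the import object is empty.
const tiny_module = new WebAssembly.Module(add_bytes)
const tiny_instance = new WebAssembly.Instance(tiny_module, {})

console.log(tiny_instance.exports.add(2, 2)) // 4
```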

Passing & Returning Values

Exported Wasm functions can only receive and return native Wasm types. This (currently) means only numbers (integers and floating-point values).

Values that can be represented as a series of numbers, e.g. strings or arrays, can be written directly to the Wasm instance memory heap from Node.js. Heap memory references can be passed as the function parameter values, allowing the Wasm code to read these values. More complex types (e.g. JS objects) are not supported.

This process can also be used in reverse, with Wasm functions returning heap references to pass back strings or arrays with the function result.

For more details on how memory works in WebAssembly, please see this page.
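
Here is a minimal sketch of the Node.js side of that process: writing a string into a WebAssembly.Memory buffer and reading it back from an offset and length. No Wasm module is involved, and the offset used is an arbitrary assumption. A real module decides where data lives (e.g. by exporting an allocator). The write_string and read_string helpers are hypothetical names.

```javascript
const { TextEncoder, TextDecoder } = require('util')

// One page of Wasm memory is 64KiB.
const memory = new WebAssembly.Memory({ initial: 1 })
const heap = new Uint8Array(memory.buffer)

// Write a string into the heap at an offset, returning its byte length.
const write_string = (text, offset) => {
  const encoded = new TextEncoder().encode(text)
  heap.set(encoded, offset)
  return encoded.length
}

// Read `length` bytes back from the heap and decode them as UTF-8.
const read_string = (offset, length) => {
  return new TextDecoder().decode(heap.subarray(offset, offset + length))
}

const offset = 16 // arbitrary location chosen for this example
const length = write_string('hello wasm', offset)
console.log(read_string(offset, length)) // hello wasm
```

The offset and length are plain integers, which is why they can be passed to (or returned from) exported Wasm functions.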

Examples

Having covered the basics, let's look at some examples...

I'll start with calling a simple C function from a Node.js serverless function. This will demonstrate the complete steps needed to compile and use a small C program as a Wasm module. Then I'll look at a more real-world use-case, dynamic image resizing. This will use a C library compiled to Wasm to improve performance.

Examples will be deployed to IBM Cloud Functions (Apache OpenWhisk). They should work on other serverless platforms (supporting the Node.js runtime) with small modifications to the handler function's interface.

Simple Function Calls

Create Source Files

Create a file add.c with the following contents:

int add(int a, int b) {
  return a + b;
}

Create a file (index.js) with the following contents:

'use strict';
const fs = require('fs');
const util = require('util')

const WASM_MODULE = 'add.wasm'
let wasm_instance 

async function load_wasm(wasm_module) {
  if (!wasm_instance) {
    const bytes = fs.readFileSync(wasm_module);
    const memory = new WebAssembly.Memory({initial: 1});
    const env = {
      __memory_base: 0, memory
    }

    const { instance, module } = await WebAssembly.instantiate(bytes, { env });
    wasm_instance = instance
  }

  return wasm_instance.exports._add
}

exports.main = async function ({ a = 1, b = 1 }) {
  const add = await load_wasm(WASM_MODULE)
  const sum = add(a, b)
  return { sum }
}

Create a file (package.json) with the following contents:

{
  "name": "wasm",
  "version": "1.0.0",
  "main": "index.js"
}

Compile Wasm Module

This C source file needs compiling to a WebAssembly module. There are different projects to handle this. I will be using Emscripten, which uses LLVM to compile C and C++ to WebAssembly.

Install the Emscripten toolchain.
Run the following command to generate the Wasm module.

emcc -s WASM=1 -s SIDE_MODULE=1 -s EXPORTED_FUNCTIONS="['_add']" -O1 add.c -o add.wasm

The SIDE_MODULE option tells the compiler the Wasm module will be loaded manually using the JS APIs. This stops Emscripten generating a corresponding JS file to do this automatically. Functions exposed on the Wasm module are controlled by the EXPORTED_FUNCTIONS configuration parameter.

Deploy Serverless Function

Create deployment package with source files.

zip action.zip index.js add.wasm package.json

Create serverless function from deployment package.

ibmcloud wsk action create wasm action.zip --kind nodejs:10

Invoke serverless function to test Wasm module.

$ ibmcloud wsk action invoke wasm -r -p a 2 -p b 2
{
    "sum": 4
}

It works! 🎉🎉🎉

Whilst this is a trivial example, it demonstrates the workflow needed to compile C source files to Wasm modules and invoke exported functions from Node.js serverless functions. Let's move onto a more realistic example...

Dynamic Image Resizing

This repository contains a serverless function to resize images using a C library called via WebAssembly. It is a fork of the original code created by Cloudflare for their Workers platform. See the original repository for details on what the repository contains and how the files work.

Checkout Repository

Retrieve the source files by checking out this repository.

git clone https://github.com/jthomas/openwhisk-image-resize-wasm

This repository contains the pre-compiled Wasm module (resizer.wasm) needed to resize images using the stb library. The module exposes two functions: init and resize.

The init function returns a heap reference to write the image bytes for processing into. The resize function is called with two values, the image byte array length and new width value. It uses these values to read the image bytes from the heap and calls the library functions to resize the image to the desired width. Resized image bytes are written back to the heap and the new byte array length is returned.

Deploy Serverless Function

Create deployment package from source files.

zip action.zip resizer.wasm package.json worker.js

Create serverless function from deployment package.

ibmcloud wsk action update resizer action.zip --kind nodejs:10 --web true

Retrieve HTTP URL for Web Action.

ibmcloud wsk action get resizer --url

This should return a URL like: https://<region>.cloud.ibm.com/api/v1/web/<ns>/default/resizer

Open the Web Action URL with the .http extension.

https://<region>.cloud.ibm.com/api/v1/web/<ns>/default/resizer.http

This should return the following image resized to 250 pixels (from 900 pixels).

URL query parameters (url and width) can be used to modify the image source or output width for the next image, e.g.

https://<region>.cloud.ibm.com/api/v1/web/<ns>/default/resizer.http?url=<IMG_URL>&width=500

Conclusion

WebAssembly may have started as a way to run native code in the browser, but soon expanded to server-side runtime environments like Node.js. WebAssembly modules are supported on any serverless platform with a Node.js v8.0.0+ runtime.

Wasm provides a fast, safe and secure way to ship portable modules from compiled languages. Developers don't have to worry about whether the module is compiled for the correct platform architecture or linked against unavailable dynamic libraries. This is especially useful for serverless functions in Node.js, where compiling native libraries for production runtimes can be challenging.

Wasm modules can be used to improve performance for computationally intensive calculations, which lowers invocation times and, therefore, costs less. It also provides an easy way to utilise additional runtimes on serverless platforms without any changes by the platform provider.

Hosting Static Websites on IBM Cloud

James Thomas — Tue, 30 Jul 2019 14:18:02 +0000

This blog post explains how to host a static website on IBM Cloud. These websites are rendered client-side by the browser from static assets, like HTML, CSS and JS files. They do not need a server-side component to create pages dynamically at runtime. Static websites are often combined with backend APIs to create Single Page Applications.

Hosting static websites on IBM Cloud uses Cloud Object Storage (COS) and Cloud Internet Services (CIS) (with Page Rules and Edge Functions). These services provide the following features needed to serve static websites.

Auto-serving static assets from provider-managed HTTP service (Cloud Object Storage).
Custom domain support to serve content from user-controlled domain name (CIS - Page Rules).
Configurable Index and Error documents (CIS - Edge Functions).

Here are the steps needed to host a static website on IBM Cloud by combining those services.

Serving static assets

IBM Cloud Object Storage is a scalable storage solution for cloud applications. Files are managed through a RESTful HTTP API and stored in user-defined collections called "buckets". Bucket files are returned as HTTP responses from HTTP GET requests.

COS supports an optional "anonymous read-only access" setting for buckets. This means all files in the bucket will be accessible using anonymous HTTP GET requests.

Putting HTML, CSS and JS files in a public bucket allows static websites to be served directly by COS. Users are charged for bandwidth used and HTTP requests received for all bucket files.

Create IBM Cloud Object Storage instance

If you already have an instance of Cloud Object Storage you can skip this step...

Provision a new instance of IBM Cloud Object Storage

Create IBM Cloud Object Storage Bucket

Open the COS instance from the Resource List.
Create a new COS bucket to host the static site files.
- Choose a Bucket name
- Choose the Resiliency, Location and Storage Class options for the bucket.

Any choices for these options can be used - it does not affect the static site hosting capability. For more details on what they mean, please see this documentation.

Upload Static Assets To Bucket

Upload static file assets to the new bucket.

Enable Public Access to bucket files

Click the "Access Policies" menu item from the bucket level menu.
Click the "Public Access" tab from the bucket access policy page.
Check the Access Group drop-down has "Public Access" option selected.
Click the "Create access policy" button and then "Enable" on the pop-up menu.

Check bucket files are accessible

Bucket files should now be accessible using the service endpoint URL, bucket id and file names. COS supports providing the bucket name in the URL path or a sub-domain on the service endpoint.

Open the "Configuration" panel on the bucket page.
Retrieve the public endpoint shown, e.g. s3.<REGION>.cloud-object-storage.appdomain.cloud

Bucket files (like index.html) should now be accessible by a web browser. COS supports both HTTP and HTTPS traffic. Bucket files are available using the following URLs.

vhost addressing

<BUCKET_NAME>.s3.<REGION>.cloud-object-storage.appdomain.cloud/index.html

url path addressing

s3.<REGION>.cloud-object-storage.appdomain.cloud/<BUCKET_NAME>/index.html

Bucket files can now be referenced directly in external web applications. COS buckets are often used to store large application assets like videos or images. For hosting an entire website, it is often necessary to serve content from a custom domain name, rather than the COS bucket hostname.

Custom domain support

Cloud Internet Services Page Rules can automatically configure custom domain support for COS buckets.

CNAME DNS records are created to alias the custom domain to the COS bucket hostname. All traffic to the custom domain will then be forwarded to the COS service.

When COS serves files from bucket sub-domains, it uses the HTTP Host request header value to determine the bucket name. With CNAME DNS records, this header value will still refer to the custom domain, rather than the bucket sub-domain. This field needs to be dynamically updated with the correct value.

Create IBM Cloud Internet Services instance

Provision a new instance of Cloud Internet Services.

Register Custom Domain name with Cloud Internet Services

Follow the documentation on how to register a custom domain with Cloud Internet Services.

This process involves delegating name server control for the domain over to IBM Cloud Internet Services.

Configure Page Rules and DNS records (automatic)

Cloud Internet Services can automatically set up Page Rules and DNS records needed to forward custom domain traffic to COS buckets. This automatically exposes the bucket as bucket-name.your-domain.com. If you want to change this default sub-domain name, follow the manual steps in the next section.

Click the Performance drop-down menu and click the "Page Rules" link.
Click the "Create rule" button from the table.
Select the Rule Behaviour Setting as "Resolve Override with COS"
Select the correct COS instance and bucket.
Click the "Create" button.

Once DNS records have propagated, bucket files should be accessible using the custom domain: http(s)://<CUSTOM_DOMAIN>/index.html.

Configure Page Rules and DNS records (manual)

These steps are only needed if you haven't followed the section above...

Create the Page Rule to modify the HTTP host header.

Click the Performance drop-down menu and select the "Page Rules" link.
Click the "Create rule" button from the table.
Set the URL match field to be <SUB_DOMAIN>.<CUSTOM_DOMAIN>/*
Select the Rule Behaviour Setting as "Host Header Override" and set the value to the bucket sub-domain: <BUCKET_NAME>.s3.<REGION>.cloud-object-storage.appdomain.cloud

Create the DNS CNAME record to forward traffic to COS.

Click the Reliability drop-down menu and click the "DNS" menu entry.
Add a new DNS record with the following values.
- Type: CNAME
- Name: <custom subdomain host>
- TTL: Automatic
- Alias Domain Name: <COS bucket sub-domain>

Name is the sub-domain on the custom domain (e.g. www) through which the COS bucket will be accessible. Alias Domain Name is the COS bucket sub-domain from above, e.g. <BUCKET_NAME>.s3.<REGION>.cloud-object-storage.appdomain.cloud

Once the record is added, set the Proxy field to true. This is necessary for the page rules to work.

Once DNS records have propagated, bucket files should be accessible using the custom domain.

Configurable Index and Error pages

COS will now serve static assets from a custom sub-domain, where file names are explicitly included in the URL, e.g. http(s)://<CUSTOM_DOMAIN>/index.html. This works fine for static websites with two exceptions, the default document for the web site and the error page.

When a user visits the COS bucket sub-domain without an explicit file path (http(s)://<CUSTOM_DOMAIN>), the COS service will return the bucket file list, rather than the site index page. Additionally, if a user requests a missing file, COS returns an XML error message rather than a custom error page.

Both issues can be resolved using Edge Functions, a new feature in Cloud Internet Services.

Edge Functions

Edge functions are JavaScript source files deployed to Cloudflare's Edge locations. They can dynamically modify HTTP traffic passing through Cloudflare's network (for domains you control). Custom edge functions are triggered on configurable URL routes. Functions are passed the incoming HTTP request and control the HTTP response returned.

Add Edge Function to provide Index & Error Documents

Using a custom edge function, HTTP traffic to the custom sub-domain can be modified to support Index and Error documents. Incoming HTTP requests without an explicit file name can be changed to use the index page location. HTTP 404 responses returned from COS can be replaced with a custom error page.

Open the "Edge Functions" page from the Cloud Internet Services instance homepage.
Click the "Create" icon on the "Actions" tab.
Enter "route-index-and-errors" in the action name field.
Paste the following source code into the action body section.

The INDEX_DOCUMENT and ERROR_DOCUMENT values control the index and error pages used to redirect requests. Replace these values with the correct page locations for the static site being hosted.

const INDEX_DOCUMENT = 'index.html'
const ERROR_DOCUMENT = '404.html'

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  const url = new URL(request.url)

  // if request is a directory path, append the index document.
  if (url.pathname.endsWith('/')) {
    url.pathname = `${url.pathname}${INDEX_DOCUMENT}`
    request = new Request(url, request)
  }

  let response = await fetch(request)

  // if bucket file is missing, return error page.
  if (response.status === 404) {    
    url.pathname = ERROR_DOCUMENT
    request = new Request(url, request)
    response = await fetch(request)

    response = new Response(response.body, {
      status: 404,
      statusText: 'Not Found',
      headers: response.headers
    })      
  } 

  return response
}

Click the "Save" button.
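
The directory-path rule can be checked locally before deploying. This sketch mirrors only the URL rewriting step from the function above, as a plain function that runs in Node.js. The with_index_document helper and the example.com URLs are hypothetical.

```javascript
const INDEX_DOCUMENT = 'index.html'

// Mirror of the directory-path rule: append the index document when
// the path ends with a slash, otherwise leave the URL untouched.
const with_index_document = (href) => {
  const url = new URL(href)
  if (url.pathname.endsWith('/')) {
    url.pathname = `${url.pathname}${INDEX_DOCUMENT}`
  }
  return url.href
}

console.log(with_index_document('http://www.example.com/'))
console.log(with_index_document('http://www.example.com/docs/?page=2'))
console.log(with_index_document('http://www.example.com/style.css'))
```

Query strings are preserved, and paths without a trailing slash (like /docs) are passed through unchanged, so links to directories need the trailing slash for this rule to apply.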

Set up Triggers for Edge Function

Select the "Triggers" panel from the Edge Functions page.
Click the "Add trigger" icon.
Set the Trigger URL to http://<SUB_DOMAIN>.<CUSTOM_DOMAIN>/*.
Select the "route-index-and-errors" action from the drop-down menu.
Click the "Save" button.

Test Index and Error Pages

Having set up the trigger and edge function, HTTP requests to the root path on the custom sub-domain will return the index page. Accessing invalid bucket files will also return the error page, rather than the COS error response.

Confirm that http://<SUB_DOMAIN>.<CUSTOM_DOMAIN>/ returns the same page as http://<SUB_DOMAIN>.<CUSTOM_DOMAIN>/index.html
Confirm that http://<SUB_DOMAIN>.<CUSTOM_DOMAIN>/missing-page.html returns the error page. This should be different to the XML error response returned by visiting <BUCKET_NAME>.s3.<REGION>.cloud-object-storage.appdomain.cloud/missing-page.html.

If this all works - the site is working! IBM Cloud is now hosting a static website using Cloud Object Storage and Cloud Internet Services with Page Rules and Edge Functions. 🎉🎉🎉

Summary

Static web sites can be hosted on IBM Cloud using Cloud Object Storage and Cloud Internet Services.

Cloud Object Storage stores the page files needed to render the static website. Anonymous bucket file access means files are accessible as public HTTP endpoints, without having to run infrastructure to serve the assets.

Cloud Internet Services forwards HTTP traffic from a custom domain to the bucket hostname. DNS CNAME records are used to resolve the sub-domain as the custom bucket hostname. Page Rules override HTTP request headers to make this work. Edge Functions are used to implement configurable Index and Error documents, by dynamically modifying in-flight requests with custom JavaScript.

Hosting static web sites using this method can be much cheaper (and easier) than traditional infrastructure. Developers only get charged for actual site usage, based on bandwidth and HTTP requests.

Connecting to IBM Cloud Databases for Redis from Node.js

James Thomas — Mon, 22 Jul 2019 14:48:13 +0000

This blog post explains how to connect to an IBM Cloud Databases for Redis instance from a Node.js application. There is a (small) difference between the connection details needed for an IBM Cloud Databases for Redis instance compared to a local instance of the open-source database. This is due to all IBM Cloud Databases using secured TLS connections with self-signed certificates.

I keep running into this issue (and forgetting how to fix it 🤦‍♂️), so I'm documenting the solution here to help myself (and others) who might run into it… 🦸‍♂️

Connecting to Redis (without TLS connections)

Most Node.js applications use the redis NPM library to interact with an instance of the database. This library has a createClient method which returns an instance of the client. The Node.js application passes a connection string into the createClient method. This string contains the hostname, port, username and password for the database instance.

const redis = require("redis");
const url = 'redis://user:secret@localhost:6379/'
const client = redis.createClient(url);

The client fires a connect event once the connection is established or an error event if issues are encountered.
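As a minimal sketch of handling those two events, the helper below wraps them in a Promise. The name waitForConnection is hypothetical (it is not part of the redis library), and the only assumption is that the client is an EventEmitter which emits connect and error as described above. A plain EventEmitter stands in for a real client so the snippet runs without a database.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: resolves once the client emits `connect`,
// rejects if it emits `error` first.
function waitForConnection(client) {
  return new Promise((resolve, reject) => {
    client.once('connect', () => resolve(client));
    client.once('error', reject);
  });
}

// Stand-in for a real client, used only to demonstrate the helper.
const fakeClient = new EventEmitter();
const connected = waitForConnection(fakeClient);
fakeClient.emit('connect');
connected.then(() => console.log('connected'));
```

With a real client, a long-lived error listener is still worth keeping, since an EventEmitter throws on an error event that has no listeners.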

IBM Cloud Databases for Redis Service Credentials

IBM Cloud Databases for Redis provide service credentials through the instance management console. Service credentials are JSON objects with connection properties for client libraries, the CLI and other tools. Connection strings for the Node.js client library are available in the connection.rediss.composed field.

So, I just copy this field value and use it with the redis.createClient method? Not so fast...

IBM Cloud Databases for Redis uses TLS to secure all connections to the Redis instances. This is denoted by the connection string using the rediss:// URL prefix, rather than redis://. Using that connection string (without further connection properties), will lead to the following error being thrown by the Node.js application.

Error: Redis connection to <id>.databases.appdomain.cloud:port failed - read ECONNRESET
  at TCP.onread (net.js:657:25) errno: 'ECONNRESET', code: 'ECONNRESET', syscall: 'read'

If createClient is forced to use a TLS connection, e.g. createClient(url, { tls: {} }), this error is replaced by a different one about self-signed certificates.

Error: Redis connection to <id>.databases.appdomain.cloud:port failed - self signed certificate in certificate chain
    at TLSSocket.onConnectSecure (_tls_wrap.js:1055:34)
    at TLSSocket.emit (events.js:182:13)
    at TLSSocket._finishInit (_tls_wrap.js:635:8) code: 'SELF_SIGNED_CERT_IN_CHAIN'

Hmmmm, how to fix this? 🤔

Connecting to Redis (with TLS connections)

All connections to IBM Cloud Databases are secured with TLS using self-signed certificates. Public certificates for the signing authorities are provided as Base64 strings in the service credentials. These certificates can be provided in the client constructor to support self-signed TLS connections.

Here are the steps needed to use those self-signed certificates with the client library...

Extract the connection.rediss.certificate.certificate_base64 value from the service credentials.

Decode the Base64 string in Node.js to extract the PEM certificate string.

const ca = Buffer.from(cert_base64, 'base64').toString('utf-8')

Provide the certificate file string as the ca property in the tls object for the client constructor.

const tls = { ca };
const client = redis.createClient(url, { tls });

…Relax! 😎

The tls property is passed through to the tls.connect method in Node.js, which is used to set up the TLS connection. This method supports a ca parameter to extend the trusted CA certificates pre-installed in the system. By providing the self-signed certificate using this property, the errors above will not be seen.
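The steps above can be pulled together into one small function. This is a sketch under assumptions: redisOptionsFromCredentials is a hypothetical name, the sample credentials are placeholders, and the composed field is handled as either a single string or a list of strings because the exact shape may differ between instances.

```javascript
// Builds the arguments for redis.createClient(url, options) from a
// service credentials object, using the field paths named above.
function redisOptionsFromCredentials(credentials) {
  const { composed, certificate } = credentials.connection.rediss;
  // Assumption: `composed` may be a single string or a list of strings.
  const url = Array.isArray(composed) ? composed[0] : composed;
  if (!url.startsWith('rediss://')) {
    throw new Error('Expected a TLS (rediss://) connection string');
  }
  // Decode the Base64 value into the PEM certificate string.
  const ca = Buffer.from(certificate.certificate_base64, 'base64').toString('utf-8');
  return { url, options: { tls: { ca } } };
}

// Placeholder credentials, not real values.
const sampleCredentials = {
  connection: {
    rediss: {
      composed: ['rediss://user:secret@example.databases.appdomain.cloud:12345/0'],
      certificate: {
        certificate_base64: Buffer.from(
          '-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n'
        ).toString('base64')
      }
    }
  }
};

const conn = redisOptionsFromCredentials(sampleCredentials);
console.log(conn.url);
// With the redis library installed:
// const client = redis.createClient(conn.url, conn.options);
```

Failing early on a redis:// string makes a copy-and-paste mistake visible before it turns into a connection reset.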

Conclusion

It took me a while to work out how to connect to TLS-secured Redis instances from a Node.js application. Providing the self-signed certificate in the client constructor is a much better solution than disabling TLS certificate verification altogether (e.g. by setting rejectUnauthorized to false)!

Since I don't write new Redis client code very often, I keep forgetting the correct constructor parameters to make this work. Turning this solution into a blog post will (hopefully) embed it in my brain (or at least provide a way to find the answer instead of having to grep through old project code). This might even be useful to others Googling for a solution to those error messages...

Serverless APIs for Machine Learning models

James Thomas — Wed, 03 Jul 2019 13:19:50 +0000

IBM's Model Asset eXchange provides a curated list of free Machine Learning models for developers. Models currently published include detecting emotions or ages in faces from images, forecasting the weather, converting speech to text and more. Models are pre-trained and ready for use in the cloud.

Models are published as a series of public Docker images. Images automatically expose a HTTP API for model predictions. Documentation in the model repositories explains how to run images locally (using Docker) or deploy to the cloud (using Kubernetes). This got me thinking…

Could MAX models be used from serverless functions? 🤔

Running machine learning models on serverless platforms can take advantage of the horizontal scalability to process large numbers of computationally intensive classification tasks in parallel. Coupled with the serverless pricing structure ("no charge for idle"), this can be an extremely cheap and effective way to perform model classifications in the cloud.

CHALLENGE ACCEPTED! 🦸‍♂️🦸‍♀️

After a couple days of experimentation, I had worked out an easy way to automatically expose MAX models as Serverless APIs on IBM Cloud Functions. 🎉🎉🎉

I've given instructions below on how to create those APIs from the models using a simple script. If you just want to use the models, follow those instructions. If you are interested in understanding how this works, keep reading as I explain afterwards what I did...

Running MAX models on IBM Cloud Functions

This repository contains a bash script which builds custom Docker runtimes with MAX models for usage on IBM Cloud Functions. Pushing these images to Docker Hub allows IBM Cloud Functions to use them as custom runtimes. Web Actions created from these custom runtime images expose the same Prediction API described in the model documentation. They can be used with no further changes or custom code needed.

prerequisites

Please follow the links below to set up the following tools before proceeding.

Check out the "Serverless MAX Models" repository. Run all the following commands from that folder.

git clone https://github.com/jthomas/serverless-max-models 
cd serverless-max-models

build custom runtime images

Set the following environment variables and run the build script.
- MODELS: MAX model names, e.g. max-facial-emotion-classifier
- USERNAME: Docker Hub username.

MODELS="..." USERNAME="..." ./build.sh

This will create Docker images locally with the MAX model names and push to Docker Hub for usage in IBM Cloud Functions. IBM Cloud Functions only supports public Docker images as custom runtimes.

create actions using custom runtimes

Create a Web Action using the custom Docker runtime.

ibmcloud wsk action create <MODEL_IMAGE> --docker <DOCKERHUB_NAME>/<MODEL_IMAGE> --web true -m 512

Retrieve the Web Action URL (https://<REGION>.functions.cloud.ibm.com/api/v1/web/<NS>/default/<ACTION>)

ibmcloud wsk action get <MODEL_IMAGE> --url

invoke web action url with prediction api parameters

Use the same API request parameters as defined in the Prediction API specification with the Web Action URL. This will invoke model predictions and return the result as the HTTP response, e.g.

curl -F "image=@assets/happy-baby.jpeg" -XPOST <WEB_ACTION_URL>

NOTE: The first invocation after creating an action may incur long cold-start delays due to the platform pulling the remote image into the local registry. Once the image is available in the platform, both further cold and warm invocations will be much faster.

Example

Here is an example of creating a serverless API using the max-facial-emotion-classifier MAX model. Further examples of models which have been tested are available here. If you encounter problems, please open an issue on GitHub.

max-facial-emotion-classifier

Facial Emotion Classifier (max-facial-emotion-classifier)

Start by creating the action using the custom runtime and then retrieve the Web Action URL.

$ ibmcloud wsk action create max-facial-emotion-classifier --docker <DOCKERHUB_NAME>/max-facial-emotion-classifier --web true -m 512
ok: created action max-facial-emotion-classifier
$ ibmcloud wsk action get max-facial-emotion-classifier --url
ok: got action max-facial-emotion-classifier
https://<REGION>.functions.cloud.ibm.com/api/v1/web/<NS>/default/max-facial-emotion-classifier

According to the API definition for this model, the prediction API expects a form submission with an image file to classify. Using a sample image from the model repo, the model can be tested using curl.

$ curl -F "image=@happy-baby.jpeg" -XPOST https://<REGION>.functions.cloud.ibm.com/api/v1/web/<NS>/default/max-facial-emotion-classifier

{
  "status": "ok",
  "predictions": [
    {
      "detection_box": [
        0.15102639296187684,
        0.3828125,
        0.5293255131964809,
        0.5830078125
      ],
      "emotion_predictions": [
        {
          "label_id": "1",
          "label": "happiness",
          "probability": 0.9860254526138306
        },
        ...
      ]
    }
  ]
}

performance

Example Invocation Duration (Cold): ~4.8 seconds

Example Invocation Duration (Warm): ~800 ms

How does this work?

background

Running machine learning classifications using pre-trained models from serverless functions has historically been challenging due to the following reason…

Developers do not control runtime environments in (most) serverless cloud platforms. Libraries and dependencies needed by the functions must be provided in the deployment package. Most platforms limit deployment package sizes (~50MB compressed & ~250MB uncompressed).

Machine Learning libraries and models can be much larger than those deployment size limits. This stops them being included in deployment packages. Loading files dynamically during invocations may be possible but incurs extremely long cold-start delays and additional costs.

Fortunately, IBM Cloud Functions is based on the open-source serverless project, Apache OpenWhisk. This platform supports bespoke function runtimes using custom Docker images. Machine learning libraries and models can therefore be provided in custom runtimes. This removes the need to include them in deployment packages or load them at runtime.

Interested in reading other blog posts about using machine learning libraries and toolkits with IBM Cloud Functions? See these posts for more details.

MAX model images

IBM's Model Asset eXchange publishes Docker images for each model, alongside the pre-trained model files. Images expose a HTTP API for predictions using the model on port 5000, built using Python and Flask. Swagger files for the APIs describe the available operations, input parameters and response bodies.

These images use a custom application framework (maxfw), based on Flask, to standardise exposing MAX models as HTTP APIs. This framework handles input parameter validation, response marshalling, CORS support, etc. This allows model runtimes to just implement the prediction API handlers, rather than the entire HTTP application.

Since the framework already handles exposing the model as a HTTP API, I started looking for a way to simulate an external HTTP request coming into the framework. If this was possible, I could trigger this fake request from a Python Web Action to perform the model classification from input parameters. The Web Action would then convert the HTTP response returned into the valid Web Action response parameters.

flask test client

Reading through the Flask documentation, I came across the perfect solution! 👏👏👏

Flask provides a way to test your application by exposing the Werkzeug test Client and handling the context locals for you. You can then use that with your favourite testing solution.

This allows application routes to be executed with the test client, without actually running the HTTP server.

max_app = MAXApp(API_TITLE, API_DESC, API_VERSION)
max_app.add_api(ModelPredictAPI, '/predict')
test_client = max_app.app.test_client()
r = test_client.post('/model/predict', data=content, headers=headers)

Using this code within a serverless Python function allows function invocations to trigger the prediction API. The serverless function only has to convert input parameters to the fake HTTP request and then serialise the response back to JSON.

python docker action

The custom MAX model runtime image needs to implement the HTTP API expected by Apache OpenWhisk. This API is used to instantiate the runtime environment and then pass in invocation parameters on each request. Since the runtime image contains all files and code needed to process requests, the /init handler becomes a no-op. The /run handler converts Web Action HTTP parameters into the fake HTTP request.

Here is the Python script used to proxy incoming Web Action requests to the framework model service.

from maxfw.core import MAXApp
from api import ModelPredictAPI
from config import API_TITLE, API_DESC, API_VERSION
import json
import base64
from flask import Flask, request, Response

max_app = MAXApp(API_TITLE, API_DESC, API_VERSION)
max_app.add_api(ModelPredictAPI, '/predict')

# Use flask test client to simulate HTTP requests for the prediction APIs
# HTTP request data will come from action invocation parameters, neat huh? :)
test_client = max_app.app.test_client()
app = Flask(__name__)

# This implements the Docker runtime API used by Apache OpenWhisk
# https://github.com/apache/incubator-openwhisk/blob/master/docs/actions-docker.md
# /init is a no-op as everything is provided in the image.
@app.route("/init", methods=['POST'])
def init():
    return ''

# Action invocation requests will be received as the `value` parameter in request body.
# Web Actions provide HTTP request parameters as `__ow_headers` & `__ow_body` parameters.
@app.route("/run", methods=['POST'])
def run():
    body = request.json
    form_body = body['value']['__ow_body']
    headers = body['value']['__ow_headers']

    # binary image content provided as base64 strings
    content = base64.b64decode(form_body)

    # send fake HTTP request to prediction API with invocation data
    r = test_client.post('/model/predict', data=content, headers=headers)
    r_headers = dict((x, y) for x, y in r.headers)

    # binary data must be encoded as base64 strings to return in JSON response
    is_image = r_headers['Content-Type'].startswith('image')
    r_data = base64.b64encode(r.data) if is_image else r.data
    body = r_data.decode("utf-8")

    response = {'headers': r_headers, 'status': r.status_code, 'body': body }
    print(r.status)
    return Response(json.dumps(response), status=200, mimetype='application/json')

app.run(host='0.0.0.0', port=8080)
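For readers more at home in Node.js, the two conversion steps in the script above can be sketched in JavaScript. This is only an illustration of the encoding rule, not part of the runtime image: the function names decodeWebActionBody and toActionResponse are hypothetical, and the response fields (headers, status, body) simply mirror the ones used by the Python handler.

```javascript
// Web Actions pass the raw request body as a base64 string in `__ow_body`.
function decodeWebActionBody(params) {
  return Buffer.from(params.__ow_body, 'base64');
}

// Mirrors the rule in the Python handler: image bodies are base64-encoded,
// everything else is returned as UTF-8 text.
function toActionResponse(status, headers, data) {
  const contentType = headers['Content-Type'] || '';
  const isImage = contentType.startsWith('image');
  const body = data.toString(isImage ? 'base64' : 'utf8');
  return { headers, status, body };
}

// Example with a JSON response body.
const jsonResponse = toActionResponse(
  200,
  { 'Content-Type': 'application/json' },
  Buffer.from('{"status":"ok"}')
);
console.log(jsonResponse.body);
```

Binary data has to be base64-encoded because the action result is itself serialised as JSON.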

building into an image

Since the MAX models already exist as public Docker images, those images can be used as base images when building custom runtimes. Those base images handle adding model files and all dependencies needed to execute them into the image.

This is the Dockerfile used by the build script to create the custom model image. The model parameter refers to the build argument containing the model name.

ARG model
FROM codait/${model}:latest

ADD openwhisk.py .

EXPOSE 8080

CMD python openwhisk.py

This is then used from the following build script to create a custom runtime image for the model.

#!/bin/bash

set -e -u

for model in $MODELS; do
  echo "Building $model runtime image"
  docker build -t $model --build-arg model=$model .
  echo "Pushing $model to Docker Hub"
  docker tag $model $USERNAME/$model
  docker push $USERNAME/$model
done

Once the image is published to Docker Hub, it can be referenced when creating new Web Actions (using the --docker parameter). 😎

ibmcloud wsk action create <MODEL_IMAGE> --docker <DOCKERHUB_NAME>/<MODEL_IMAGE> --web true -m 512

Conclusion

IBM's Model Asset eXchange is a curated collection of Machine Learning models, ready to deploy to the cloud for a variety of tasks. All models are available as a series of public Docker images. Model images automatically expose HTTP APIs for classifications.

Documentation in the model repositories explains how to run them locally and deploy using Kubernetes, but what about using them on serverless cloud platforms? Serverless platforms are becoming a popular option for deploying Machine Learning models, due to horizontal scalability and cost advantages.

Looking through the source code for the model images, I discovered a mechanism to hook into the custom model framework used to export the model files as HTTP APIs. This allowed me to write a simple wrapper script to proxy serverless function invocations to the model prediction APIs. API responses would be serialised back into the Web Action response format.

Building this script into a new Docker image, using the existing model image as the base image, created a new runtime which could be used on the platform. Web Actions created from this runtime image would automatically expose the same HTTP APIs as the existing image!

Saving Money and Time With Node.js Worker Threads in Serverless Functions

James Thomas — Fri, 10 May 2019 15:22:49 +0000

Node.js v12 was released last month. This new version includes support for Worker Threads, which are enabled by default. Node.js Worker Threads make it simple to execute JavaScript code in parallel using threads. 👏👏👏

This is useful for Node.js applications with CPU-intensive workloads. Using Worker Threads, JavaScript code can be executed concurrently using multiple CPU cores. This reduces execution time compared to a non-Worker Threads version.

If serverless platforms provide Node.js v12 on multi-core environments, functions can use this feature to reduce execution time and, therefore, lower costs. Depending on the workload, functions can utilise all available CPU cores to parallelise work, rather than executing more functions concurrently. 💰💰💰

In this blog post, I'll explain how to use Worker Threads from a serverless function. I'll be using IBM Cloud Functions (Apache OpenWhisk) as the example platform but this approach is applicable for any serverless platform with Node.js v12 support and a multi-core CPU runtime environment.

Node.js v12 in IBM Cloud Functions (Apache OpenWhisk)

This section of the blog post is specifically about using the new Node.js v12 runtime on IBM Cloud Functions (powered by Apache OpenWhisk). If you are using a different serverless platform, feel free to skip ahead to the next section…

I've recently been working on adding the Node.js v12 runtime to Apache OpenWhisk.

Apache OpenWhisk uses Docker containers as runtime environments for serverless functions. All runtime images are maintained in separate repositories for each supported language, e.g. Node.js, Java, Python, etc. Runtime images are automatically built and pushed to Docker Hub when the repository is updated.

node.js v12 runtime image

Here is the PR used to add the new Node.js v12 runtime image to Apache OpenWhisk. This led to the following runtime image being exported to Docker Hub: openwhisk/action-nodejs-v12.

Having this image available as a native runtime in Apache OpenWhisk requires upstream changes to the project's runtime manifest. After this happens, developers will be able to use the --kind CLI flag to select this runtime version.

ibmcloud wsk action create action_name action.js --kind nodejs:12

IBM Cloud Functions is powered by Apache OpenWhisk. It will eventually pick up the upstream project changes to include this new runtime version. Until that happens, Docker support allows usage of this new runtime before it is built into the platform.

ibmcloud wsk action create action_name action.js --docker openwhisk/action-nodejs-v12

example

This Apache OpenWhisk action returns the version of Node.js used in the runtime environment.

function main () {
  return {
    version: process.version
  }
}

Running this code on IBM Cloud Functions, using the Node.js v12 runtime image, allows us to confirm the new Node.js version is available.

$ ibmcloud wsk action create nodejs-v12 action.js --docker openwhisk/action-nodejs-v12
ok: created action nodejs-v12
$ ibmcloud wsk action invoke nodejs-v12 --result
{
    "version": "v12.1.0"
}

Worker Threads in Serverless Functions

This is a great introductory blog post on Worker Threads. It uses prime number generation as the CPU-intensive task to benchmark. Compared with the single-threaded version, performance improves in proportion to the number of threads used (up to the number of CPU cores available).

This code can be ported to run in a serverless function. Running with different input values and thread counts will allow benchmarking of the performance improvement.

non-workers version

Here is the sample code for a serverless function to generate prime numbers. It does not use Worker Threads. It will run on the main event loop for the Node.js process. This means it will only utilise a single thread (and therefore single CPU core).

'use strict';

const min = 2

function main(params) {
  const { start, end } = params
  console.log(params)
  const primes = []
  let isPrime = true;
  for (let i = start; i < end; i++) {
    for (let j = min; j < Math.sqrt(end); j++) {
      if (i !== j && i%j === 0) {
        isPrime = false;
        break;
      }
    }
    if (isPrime) {
      primes.push(i);
    }
    isPrime = true;
  }

  return { primes }
}

porting the code to use worker threads

Here is the prime number calculation code which uses Worker Threads. Dividing the total input range by the number of Worker Threads generates individual thread input values. Worker Threads are spawned and passed chunked input ranges. Threads calculate primes and then send the result back to the parent thread.

Reviewing the code to start converting it to a serverless function, I realised there were two issues running this code in a serverless environment: worker thread initialisation and optimal worker thread counts.

How to initialise Worker Threads?

This is how the existing source code initialises the Worker Threads.

 threads.add(new Worker(__filename, { workerData: { start: myStart, range }}));

__filename is a special global variable in Node.js which contains the currently executing script file path.

This means the Worker Thread will be initialised with a copy of the currently executing script. Node.js provides a special variable to indicate whether the script is executing in the parent or child thread. This can be used to branch script logic.

So, what's the issue with this?

In the Apache OpenWhisk Node.js runtime, action source files are dynamically imported into the runtime environment. The script used to start the Node.js runtime process is for the platform handler, not the action source files. This means the __filename variable does not point to the action source file.

This issue is fixed by separating the serverless function handler and worker thread code into separate files. Worker Threads can be started with a reference to the worker thread script source file, rather than the currently executing script name.

 threads.add(new Worker("./worker.js", { workerData: { start: myStart, range }}));

How Many Worker Threads?

The next issue to resolve is how many Worker Threads to use. In order to maximise parallel processing capacity, there should be a Worker Thread for each CPU core. This is the maximum number of threads that can run concurrently.

Node.js provides CPU information for the runtime environment using the os.cpus() function. The result is an array of objects (one per logical CPU core), with model information, processing speed and elapsed processing times. The length of this array will determine the number of Worker Threads used. This ensures the number of Worker Threads will always match the CPU cores available.

const threadCount = os.cpus().length
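To illustrate how the thread count feeds into the work split, here is a minimal sketch of dividing an input range into one chunk per thread. The helper name splitRange is hypothetical; the start and range fields match the workerData shape used in this post.

```javascript
const os = require('os');

// Hypothetical helper: split [min, max) into at most `parts` contiguous
// chunks, each described by a start value and a range (length).
function splitRange(min, max, parts) {
  const size = Math.ceil((max - min) / parts);
  const chunks = [];
  for (let start = min; start < max; start += size) {
    // The final chunk is clipped so it never runs past `max`.
    chunks.push({ start, range: Math.min(size, max - start) });
  }
  return chunks;
}

const cores = os.cpus().length;
console.log(splitRange(2, 10000000, cores));
```

Clipping the last chunk keeps the chunks contiguous and non-overlapping even when the range does not divide evenly by the number of cores.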

worker threads version

Here is the serverless version of the prime number generation algorithm which uses Worker Threads.

The code is split over two files - primes-with-workers.js and worker.js.

primes-with-workers.js

This file contains the serverless function handler used by the platform. Input ranges (based on the min and max action parameters) are divided into chunks, based upon the number of Worker Threads. The handler function creates a Worker Thread for each chunk and waits for the message with the result. Once all the results have been retrieved, it returns all those prime numbers as the invocation result.

'use strict';

const { Worker } = require('worker_threads');
const os = require('os')
const threadCount = os.cpus().length

const compute_primes = async (start, range) => {
  return new Promise((resolve, reject) => {
    let primes = []
    console.log(`adding worker (${start} => ${start + range})`)
    const worker = new Worker('./worker.js', { workerData: { start, range }})

    worker.on('error', reject)
    worker.on('exit', () => resolve(primes))
    worker.on('message', msg => {
      primes = primes.concat(msg)
    })
  })
}

async function main(params) {
  const { min, max } = params
  const range = Math.ceil((max - min) / threadCount)
  let start = min < 2 ? 2 : min
  const workers = []

  console.log(`Calculating primes with ${threadCount} threads...`);

  for (let i = 0; i < threadCount - 1; i++) {
    const myStart = start
    workers.push(compute_primes(myStart, range))
    start += range
  }

  workers.push(compute_primes(start, max - start))

  const primes = await Promise.all(workers)
  return { primes: primes.flat() }
}

exports.main = main

worker.js

This is the script used in the Worker Thread. The workerData value is used to receive number ranges to search for prime numbers. Prime numbers are sent back to the parent thread using the postMessage function. Since this script is only used in the Worker Thread, it does not need to use the isMainThread value to check if it is running in a child or parent thread.

'use strict';
const { parentPort, workerData } = require('worker_threads');

const min = 2

function generatePrimes(start, range) {
  const primes = []
  let isPrime = true;
  let end = start + range;
  for (let i = start; i < end; i++) {
    for (let j = min; j < Math.sqrt(end); j++) {
      if (i !== j && i%j === 0) {
        isPrime = false;
        break;
      }
    }
    if (isPrime) {
      primes.push(i);
    }
    isPrime = true;
  }

  return primes
}

const primes = generatePrimes(workerData.start, workerData.range);
parentPort.postMessage(primes)

package.json

Source files deployed from a zip file also need to include a package.json file in the archive. The main property is used to determine the script to import as the exported package module.

{
  "name": "worker_threads",
  "version": "1.0.0",
  "main": "primes-with-workers.js"
}

Performance Comparison

Running both functions with the same input parameters allows execution time comparison. The Worker Threads version should improve performance by a factor proportional to available CPU cores. Reducing execution time also means reduced costs in a serverless platform.

non-workers performance

Creating a new serverless function (primes) from the non-worker threads source code, using the Node.js v12 runtime, I can test with small values to check correctness.

$ ibmcloud wsk action create primes primes.js --docker openwhisk/action-nodejs-v12
ok: created action primes
$ ibmcloud wsk action invoke primes --result -p start 2 -p end 10
{
    "primes": [ 2, 3, 5, 7 ]
}

Playing with sample input values, 10,000,000 seems like a useful benchmark value. This takes long enough with the single-threaded version to benefit from parallelism.

$ time ibmcloud wsk action invoke primes --result -p start 2 -p end 10000000 > /dev/null

real    0m35.151s
user    0m0.840s
sys 0m0.315s

Using the simple single-threaded algorithm, it takes the serverless function around 35 seconds to calculate primes up to ten million.

worker threads performance

Creating a new serverless function, from the worker threads-based source code using the Node.js v12 runtime, allows me to verify it works as expected for small input values.

$ ibmcloud wsk action create primes-workers action.zip --docker openwhisk/action-nodejs-v12
ok: created action primes-workers
$ ibmcloud wsk action invoke primes-workers --result -p min 2 -p max 10
{
    "primes": [ 2, 3, 5, 7 ]
}

Hurrah, it works.

Invoking the function with a max parameter of 10,000,000 allows us to benchmark against the non-workers version of the code.

$ time ibmcloud wsk action invoke primes-workers --result -p min 2 -p max 10000000 > /dev/null

real    0m8.863s
user    0m0.804s
sys 0m0.302s

The workers version takes only ~25% of the time of the single-threaded version!

This is because IBM Cloud Functions' runtime environments provide access to four CPU cores. Unlike other platforms, CPU cores are not tied to memory allocations. Utilising all available CPU cores concurrently allows the algorithm to run four times as fast. Since serverless platforms charge based on execution time, reducing execution time also means reducing costs.

The worker threads version also costs 75% less than the single-threaded version!
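The arithmetic behind those percentages can be checked from the two timings reported above. This is a rough sketch: the figures are wall-clock times from the CLI (so they include network overhead), and the cost comparison assumes billing is proportional to execution time at the same memory setting.

```javascript
// Wall-clock timings reported by `time` above, in seconds.
const singleThreaded = 35.151;
const workerThreads = 8.863;

const speedup = singleThreaded / workerThreads;
const reduction = (1 - workerThreads / singleThreaded) * 100;

console.log(`speedup: ${speedup.toFixed(2)}x`);      // speedup: 3.97x
console.log(`reduction: ${reduction.toFixed(1)}%`);  // reduction: 74.8%
```

A speedup just under 4x on four cores suggests thread start-up and message passing add very little overhead for this workload.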

Conclusion

Node.js v12 was released in April 2019. This version included support for Worker Threads, that were enabled by default (rather than needing an optional runtime flag). Using multiple CPU cores in Node.js applications has never been easier!

Node.js applications with CPU-intensive workloads can utilise this feature to reduce execution time. Since serverless platforms charge based upon execution time, this is especially useful for Node.js serverless functions. Utilising multiple CPU cores leads not only to improved performance but also to lower bills.

PRs have been opened to enable Node.js v12 as a built-in runtime in the Apache OpenWhisk project. The Docker image for the new runtime version is already available on Docker Hub. This means it can be used with any Apache OpenWhisk instance straight away!

Playing with Worker Threads on IBM Cloud Functions allowed me to demonstrate how to speed up performance for CPU-intensive workloads by utilising multiple cores concurrently. Using an example of prime number generation, calculating all primes up to ten million took ~35 seconds with a single thread and ~9 seconds with four threads. This represents a reduction in execution time and cost of 75%!

Serverless CI/CD with Travis CI, Serverless Framework and IBM Cloud Functions

James Thomas — Tue, 07 May 2019 10:34:46 +0000

How do you set up a CI/CD pipeline for serverless applications?

This blog post will explain how to use Travis CI, The Serverless Framework and the AVA testing framework to set up a fully-automated build, deploy and test pipeline for a serverless application. It will use a real example of a production serverless application, built using Apache OpenWhisk and running on IBM Cloud Functions. The CI/CD pipeline will execute the following tasks...

Run project unit tests.
Deploy application to test environment.
Run acceptance tests against test environment.
Deploy application to production environment.
Run smoke tests against production environment.

Before diving into the details of the CI/CD pipeline setup, let's start by showing the example serverless application being used for this project...

Serverless Project - http://apache.jamesthom.as/

The "Apache OpenWhisk Release Verification" project is a serverless web application to help committers verify release candidates for the open-source project. It automates running the verification steps from the ASF release checklist using serverless functions. Automating release candidate validation makes it easier for committers to participate in release voting.

The project consists of static web assets (HTML, JS, CSS files) and HTTP APIs. Static web assets are hosted by GitHub Pages from the project repository. HTTP APIs are implemented as Apache OpenWhisk actions and exposed using the API Gateway service. IBM Cloud Functions is used to host the Apache OpenWhisk application.

No other cloud services, like databases, are needed by the backend. Release candidate information is retrieved in real-time by parsing the HTML page from the ASF website.

Configuration

The Serverless Framework (with the Apache OpenWhisk provider plugin) is used to define the serverless functions used in the application. HTTP endpoints are also defined in the YAML configuration file.

service: release-verfication

provider:
  name: openwhisk
  runtime: nodejs:10

functions:
  versions:
    handler: index.versions
    events:
      - http: GET /api/versions
  version_files:
    handler: index.version_files
    events:
      - http:
          method: GET
          path: /api/versions/{version}
          resp: http
...

plugins:
  - serverless-openwhisk

The framework handles all deployment and configuration tasks for the application. Setting up the application in a new environment is as simple as running the serverless deploy command.

Environments

Apache OpenWhisk uses namespaces to group individual packages, actions, triggers and rules. Different namespaces can be used to provide isolated environments for applications.

IBM Cloud Functions automatically creates user-based namespaces in platform instances. These auto-generated namespaces mirror the IBM Cloud organisation and space used to access the instance. Creating new spaces within an organisation will provision extra namespaces.

I'm using a custom organisation for the application with three different spaces: dev, test and prod.

dev is used as a test environment to deploy functions during development. test is used by the CI/CD pipeline to deploy a temporary instance of the application during acceptance tests. prod is the production environment hosting the external application actions.

Credentials

The IBM Cloud CLI is used to handle IBM Cloud Functions credentials. Platform API keys will be used to log in the CLI from the CI/CD system.

When Cloud Functions CLI commands are issued (after targeting a new region, organisation or space), API keys for that Cloud Functions instance are automatically retrieved and stored locally. The Serverless Framework knows how to use these local credentials when interacting with the platform.

High Availability?

The Apache OpenWhisk Release Verifier is not a critical cloud application which needs "five nines" of availability. The application is idle most of the time. It does not need a highly available serverless architecture. This means the build pipeline does not have to...

Deploy application instances in multiple cloud regions.
Set up a global load balancer between regional instances.
Support "zero downtime deploys" to minimise downtime during deployments.
Automatically roll back to previous versions on production issues.

New deployments will simply overwrite resources in the production namespace in a single region. If the production site is broken after a deployment, the smoke tests should catch this and email me to fix it!

Testing

Given this tool will be used to check release candidates for the open-source project, I wanted to ensure it worked properly! Incorrect validation results could lead to invalid source archives being published.

I've chosen to rely heavily on unit tests to check the core business logic. These tests ensure all validation tasks work correctly, including PGP signature verification, cryptographic hash matching, LICENSE file contents and other ASF requirements for project releases.

Additionally, I've used end-to-end acceptance tests to validate the HTTP APIs work as expected. HTTP requests are sent to the API GW endpoints, with responses compared against expected values. All available release candidates are run through the validation process to check no errors are returned.

Unit Tests

Unit tests are implemented with the AVA testing framework. Unit tests live in the test/unit/ folder.

The npm test command alias runs the ava test/unit/ command to execute all unit tests. This command can be executed locally, during development, or from the CI/CD pipeline.

$ npm test

> release-verification@1.0.0 test ~/code/release-verification
> ava test/unit/

 27 tests passed

Acceptance Tests

Acceptance tests check API endpoints return the expected responses for valid (and invalid) requests. Acceptance tests are executed against the API Gateway endpoints for an application instance.

The hostname used for HTTP requests is controlled using an environment variable (HOST). Since the same test suite is used for acceptance and smoke tests, setting this environment variable is the only configuration needed to run tests against different environments.

API endpoints in the test and production environments are exposed using different custom sub-domains (apache-api-test.jamesthom.as and apache-api.jamesthom.as respectively). NPM scripts are used to provide commands (acceptance-test & acceptance-prod) which set the environment hostname before running the test suite.

"scripts": {
    "acceptance-test": "HOST=apache-api-test.jamesthom.as ava -v --fail-fast test/acceptance/",
    "acceptance-prod": "HOST=apache-api.jamesthom.as ava -v --fail-fast test/acceptance/"
  },

$ npm run acceptance-prod

> release-verification@1.0.0 acceptance-prod ~/code/release-verification
> HOST=apache-api.jamesthom.as ava -v --fail-fast  test/acceptance/

  ✔ should return list of release candidates (3.7s)
    ℹ running api testing against https://apache-api.jamesthom.as/api/versions
  ✔ should return 404 for file list when release candidate is invalid (2.1s)
    ℹ running api testing against https://apache-api.jamesthom.as/api/versions/unknown
  ...

  6 tests passed

Acceptance tests are also implemented with the AVA testing framework. All acceptance tests live in a single test file (test/acceptance/api.js).
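As a rough sketch of how a test file could pick up the HOST variable, the helper below builds request URLs from it. The apiUrl function is a hypothetical name used for illustration and is not taken from the project's test code.

```javascript
// Hypothetical helper for an acceptance test file: builds request URLs from
// the HOST environment variable set by the NPM scripts.
function apiUrl(path, env = process.env) {
  const host = env.HOST;
  if (!host) {
    throw new Error('HOST environment variable must be set');
  }
  return `https://${host}${path}`;
}

// Same test code, different environments.
console.log(apiUrl('/api/versions', { HOST: 'apache-api-test.jamesthom.as' }));
console.log(apiUrl('/api/versions', { HOST: 'apache-api.jamesthom.as' }));
```

Failing early when HOST is missing avoids accidentally sending test requests to an unintended default host.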

CI/CD Pipeline

When new commits are pushed to the master branch on the project repository, the following steps need to be kicked off by the build pipeline…

Run project unit tests.
Deploy application to test environment.
Run acceptance tests against test environment.
Deploy application to production environment.
Run smoke tests against production environment.

If any of the steps fail, the build pipeline should stop and send me a notification email.
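The stop-on-first-failure behaviour can be sketched in a few lines of JavaScript. This only illustrates the control flow I want from the pipeline; it is not how the CI system is implemented, and the step functions and notify callback are placeholders.

```javascript
// Runs async steps in order and stops at the first failure.
// Step names mirror the list above; the step functions are placeholders.
async function runPipeline(steps, notify) {
  const completed = [];
  for (const [name, step] of steps) {
    try {
      await step();
      completed.push(name);
    } catch (err) {
      // Report the failure and skip every remaining step.
      notify(`Build failed at "${name}": ${err.message}`);
      return { ok: false, completed, failedAt: name };
    }
  }
  return { ok: true, completed, failedAt: null };
}

const steps = [
  ['unit tests', async () => {}],
  ['deploy to test', async () => {}],
  ['acceptance tests', async () => { throw new Error('simulated failure'); }],
  ['deploy to production', async () => {}],
  ['smoke tests', async () => {}],
];

runPipeline(steps, console.error).then(result => console.log(result));
```

With the simulated acceptance test failure, the production deployment and smoke tests never run.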

Travis

Travis CI is used to implement the CI/CD build pipeline. Travis CI uses a custom file (.travis.yml) in the project repository to configure the build pipeline. This YAML file defines commands to execute during each phase of the build pipeline. If any of the commands fail, the build will stop at that phase without proceeding.

Here is the completed .travis.yml file for this project: https://github.com/jthomas/openwhisk-release-verification/blob/master/.travis.yml

I'm using the following Travis CI build phases to implement the pipeline: install, before_script, script, before_deploy and deploy. Commands will run in the Node.js 10 build environment, which pre-installs the language runtime and package manager.

language: node_js
node_js:
  - "10"

install

In the install phase, I need to set up the build environment to deploy the application and run tests.

This means installing the IBM Cloud CLI, Cloud Functions CLI plugin, The Serverless Framework (with Apache OpenWhisk plugin), application test framework (AvaJS) and other project dependencies.

The IBM Cloud CLI is installed using a shell script. Running a CLI sub-command installs the Cloud Functions plugin.

The Serverless Framework is installed as a global NPM package (using npm -g install). The Apache OpenWhisk provider plugin is handled as a normal project dependency, along with the test framework. Both those dependencies are installed using NPM.

install:
  - curl -fsSL https://clis.cloud.ibm.com/install/linux | sh
  - ibmcloud plugin install cloud-functions
  - npm install serverless -g
  - npm install

before_script

This phase is used to run unit tests, catching errors in core business logic, before setting up credentials (used in the script phase) for the acceptance test environment. Unit test failures will halt the build immediately, skipping test and production deployments.

Custom variables provide the API key, platform endpoint, organisation and space identifiers which are used for the test environment. The CLI is authenticated using these values, before running the ibmcloud fn api list command. This ensures Cloud Functions credentials are available locally, as used by The Serverless Framework.

before_script:
  - npm test
  - ibmcloud login --apikey $IBMCLOUD_API_KEY -a $IBMCLOUD_API_ENDPOINT
  - ibmcloud target -o $IBMCLOUD_ORG -s $IBMCLOUD_TEST_SPACE
  - ibmcloud fn api list > /dev/null
  - ibmcloud target

script

With the build system configured, the application can be deployed to the test environment, followed by running acceptance tests. If either deployment or acceptance tests fail, the build will stop, skipping the production deployment.

Acceptance tests use an environment variable to configure the hostname test cases are executed against. The npm run acceptance-test alias command sets this value to the test environment hostname (apache-api-test.jamesthom.as) before running the test suite.

script:
  - sls deploy
  - npm run acceptance-test

before_deploy

Before deploying to production, Cloud Functions credentials need to be updated. The IBM Cloud CLI is used to target the production environment, before running a Cloud Functions CLI command. This updates local credentials with the production environment credentials.

before_deploy:
  - ibmcloud target -s $IBMCLOUD_PROD_SPACE
  - ibmcloud fn api list > /dev/null
  - ibmcloud target

deploy

If all the preceding stages have successfully finished, the application can be deployed to production. Following this final deployment, smoke tests are used to check production APIs still work as expected.

Smoke tests are just the same acceptance tests executed against the production environment. The npm run acceptance-prod alias command sets the hostname configuration value to the production environment (apache-api.jamesthom.as) before running the test suite.

deploy:
  provider: script
  script: sls deploy && npm run acceptance-prod
  skip_cleanup: true

Using the skip_cleanup parameter leaves installed artifacts from previous phases in the build environment. This means we don't have to re-install the IBM Cloud CLI, The Serverless Framework or NPM dependencies needed to run the production deployment and smoke tests.

success?

If all of the build phases are successful, the latest project code should have been deployed to the production environment. 💯💯💯

If the build failed due to unit test failures, the test suite can be run locally to fix any errors. Deployment failures can be investigated using the console output logs from Travis CI. Acceptance test issues, against test or production environments, can be debugged by logging into those environments locally and running the test suite from my development machine.

Conclusion

Using Travis CI with The Serverless Framework and a JavaScript testing framework, I was able to set up a fully-automated CI/CD deployment pipeline for the Apache OpenWhisk release candidate verification tool.

Using a CI/CD pipeline, rather than a manual approach, for deployments has the following advantages...

No more manual and error-prone deploys relying on a human 👨‍💻 :)
Automatic unit & acceptance test execution catches errors before deployments.
Production environment only accessed by CI/CD system, reducing accidental breakages.
All cloud resources must be configured in code. No "snowflake" environments allowed.

Having finished code for new project features or bug fixes, all I have to do is push changes to the GitHub repository. This fires the Travis CI build pipeline which will automatically deploy the updated application to the production environment. If there are any issues, due to failed tests or deployments, I'll be notified by email.

This allows me to get back to adding new features to the tool (and fixing bugs) rather than wrestling with deployments, managing credentials for multiple environments and then trying to remember to run tests against the correct instances!

Serverless Machine Learning With TensorFlow.js

James Thomas — Mon, 13 Aug 2018 16:42:47 +0000

In a previous blog post, I showed how to use TensorFlow.js on Node.js to run visual recognition on images from the local filesystem. TensorFlow.js is a JavaScript version of the open-source machine learning library from Google.

Once I had this working with a local Node.js script, my next idea was to convert it into a serverless function. Running this function on IBM Cloud Functions (Apache OpenWhisk) would turn the script into my own visual recognition microservice.

Sounds easy, right? It's just a JavaScript library? So, zip it up and away we go... ahem 👊

Converting the image classification script to run in a serverless environment had the following challenges...

TensorFlow.js libraries need to be available in the runtime.
Native bindings for the library must be compiled against the platform architecture.
Model files need to be loaded from the filesystem.

Some of these issues were more challenging than others to fix! Let's start by looking at the details of each issue, before explaining how Docker support in Apache OpenWhisk can be used to resolve them all.

Challenges

TensorFlow.js Libraries

TensorFlow.js libraries are not included in the Node.js runtimes provided by Apache OpenWhisk.

External libraries can be imported into the runtime by deploying applications from a zip file. Custom node_modules folders included in the zip file will be extracted in the runtime. Zip files are limited to a maximum size of 48MB.

Library Size

Running npm install for the TensorFlow.js libraries used revealed the first problem... the resulting node_modules directory was 175MB. 😱

Looking at the contents of this folder, the tfjs-node module compiles a native shared library (libtensorflow.so) that is 135MB. This means no amount of JavaScript minification is going to get those external dependencies under the magic 48MB limit. 👎

Native Dependencies

The libtensorflow.so native shared library must be compiled using the platform runtime. Running npm install locally automatically compiles native dependencies against the host platform. Local environments may use different operating systems or CPU architectures (e.g. Mac vs Linux) or link against shared libraries not available in the serverless runtime.

MobileNet Model Files

TensorFlow model files need loading from the filesystem in Node.js. Serverless runtimes do provide a temporary filesystem inside the runtime environment. Files from deployment zip files are automatically extracted into this environment before invocations. There is no external access to this filesystem outside the lifecycle of the serverless function.

Model files for the MobileNet model were 16MB. If these files are included in the deployment package, it leaves 32MB for the rest of the application source code. Although the model files are small enough to include in the zip file, what about the TensorFlow.js libraries? Is this the end of the blog post? Not so fast....

Apache OpenWhisk's support for custom runtimes provides a simple solution to all these issues!

Custom Runtimes

Apache OpenWhisk uses Docker containers as the runtime environments for serverless functions (actions). All platform runtime images are published on Docker Hub, allowing developers to start these environments locally.

Developers can also specify custom runtime images when creating actions. These images must be publicly available on Docker Hub. Custom runtimes have to expose the same HTTP API used by the platform for invoking actions.

Using platform runtime images as parent images makes it simple to build custom runtimes. Users can run commands during the Docker build to install additional libraries and other dependencies. The parent image already contains source files with the HTTP API service handling platform requests.

TensorFlow.js Runtime

Here is the Docker build file for the Node.js action runtime with additional TensorFlow.js dependencies.

FROM openwhisk/action-nodejs-v8:latest

RUN npm install @tensorflow/tfjs @tensorflow-models/mobilenet @tensorflow/tfjs-node jpeg-js

COPY mobilenet mobilenet

openwhisk/action-nodejs-v8:latest is the Node.js action runtime image published by OpenWhisk.

TensorFlow libraries and other dependencies are installed using npm install in the build process. Native dependencies for the @tensorflow/tfjs-node library are automatically compiled for the correct platform by installing during the build process.

Since I'm building a new runtime, I've also added the MobileNet model files to the image. Whilst not strictly necessary, removing them from the action zip file reduces deployment times.

Want to skip the next step? Use this image jamesthomas/action-nodejs-v8:tfjs rather than building your own.

Building The Runtime

In the previous blog post, I showed how to download model files from the public storage bucket.

Download a version of the MobileNet model and place all files in the mobilenet directory.
Copy the Docker build file from above to a local file named Dockerfile.
Run the Docker build command to generate a local image.

docker build -t tfjs .

Tag the local image with a remote username and repository.

docker tag tfjs <USERNAME>/action-nodejs-v8:tfjs

Replace <USERNAME> with your Docker Hub username.

Push the local image to Docker Hub

 docker push <USERNAME>/action-nodejs-v8:tfjs

Once the image is available on Docker Hub, actions can be created using that runtime image. 😎

Example Code

This source code implements image classification as an OpenWhisk action. Image files are provided as a Base64 encoded string using the image property on the event parameters. Classification results are returned as the results property in the response.

Caching Loaded Models

Serverless platforms initialise runtime environments on-demand to handle invocations. Once a runtime environment has been created, it will be re-used for further invocations with some limits. This improves performance by removing the initialisation delay ("cold start") from request processing.

Applications can exploit this behaviour by using global variables to maintain state across requests. This is often used to cache opened database connections or store initialisation data loaded from external systems.

I have used this pattern to cache the MobileNet model used for classification. During cold invocations, the model is loaded from the filesystem and stored in a global variable. Warm invocations then use the existence of that global variable to skip the model loading process with further requests.

Caching the model reduces the time (and therefore cost) for classifications on warm invocations.
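Here is a minimal sketch of the caching pattern, assuming an action handler called main that receives the image parameter and returns a results property. The fakeLoader function stands in for the real MobileNet loading code so the example runs without TensorFlow.js installed; its return value is made up.

```javascript
// Global variable: survives between invocations in a warm runtime.
let cachedModel = null;

// Returns the cached model, loading it only on the first (cold) invocation.
// `loadModel` is a placeholder for the real model loading code.
async function getModel(loadModel) {
  if (!cachedModel) {
    cachedModel = await loadModel();
  }
  return cachedModel;
}

// Stand-in loader that counts how often it is called.
let loads = 0;
const fakeLoader = async () => {
  loads += 1;
  return { classify: async () => [{ className: 'stub', probability: 1 }] };
};

// Hypothetical action handler using the cached model.
async function main(params) {
  const model = await getModel(fakeLoader);
  return { results: await model.classify(params.image) };
}

// Two sequential invocations in the same process: the model loads once.
const cachingDemo = main({ image: '...' })
  .then(() => main({ image: '...' }))
  .then(() => console.log(`model loaded ${loads} time(s)`));
```

The second call skips the loader entirely, which is where the time saved on warm invocations comes from.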

Memory Leak

Running the Node.js script from the previous blog post on IBM Cloud Functions was possible with minimal modifications. Unfortunately, performance testing revealed a memory leak in the handler function. 😢

Reading more about how TensorFlow.js works on Node.js uncovered the issue...

TensorFlow.js's Node.js extensions use a native C++ library to execute the Tensors on a CPU or GPU engine. Memory allocated for Tensor objects in the native library is retained until the application explicitly releases it or the process exits. TensorFlow.js provides a dispose method on the individual objects to free allocated memory. There is also a tf.tidy method to automatically clean up all allocated objects within a frame.

Reviewing the code, tensors were being created as model input from images on each request. These objects were not disposed before returning from the request handler. This meant native memory grew unbounded. Adding an explicit dispose call to free these objects before returning fixed the issue.
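One way to make sure the input is always released is to wrap the classification in try/finally. This is a sketch of that idea rather than the exact fix used in the action. The withDisposal helper is a hypothetical name, and fakeTensor stands in for a real tf.Tensor (which exposes the same dispose method) so the example runs without TensorFlow.js.

```javascript
// Runs `fn` with a tensor-like input and always releases it afterwards,
// even if `fn` throws. Works with any object exposing dispose().
async function withDisposal(input, fn) {
  try {
    return await fn(input);
  } finally {
    input.dispose();
  }
}

// Stand-in for a tensor so the sketch runs without TensorFlow.js installed.
const fakeTensor = {
  disposed: false,
  dispose() { this.disposed = true; },
};

const disposalDemo = withDisposal(fakeTensor, async () => 'classified')
  .then(result => console.log(result, 'disposed:', fakeTensor.disposed));
```

Using finally means the native memory is freed on error paths too, not just when classification succeeds.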

Profiling & Performance

Action code records memory usage and elapsed time at different stages in the classification process.

Recording memory usage allows me to modify the maximum memory allocated to the function for optimal performance and cost. Node.js provides a standard library API to retrieve memory usage for the current process. Logging these values allows me to inspect memory usage at different stages.

Timing different tasks in the classification process, i.e. model loading, image classification, gives me an insight into how efficient classification is compared to other methods. Node.js has a standard library API for timers to record and print elapsed time to the console.
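The standard library calls involved are process.memoryUsage() and console.time() / console.timeEnd(). The sketch below shows one way to combine them. The formatMemoryUsage helper and the choice of 1 MB = 1024 * 1024 bytes are my assumptions for illustration.

```javascript
// Converts bytes to megabytes, rounded to two decimal places.
// Assumes 1 MB = 1024 * 1024 bytes.
const toMB = bytes => Math.round((bytes / 1024 / 1024) * 100) / 100;

// Formats the values returned by process.memoryUsage() as a single log line.
function formatMemoryUsage(usage = process.memoryUsage()) {
  const { rss, heapTotal, heapUsed, external } = usage;
  return `memory used: rss=${toMB(rss)} MB, heapTotal=${toMB(heapTotal)} MB, ` +
    `heapUsed=${toMB(heapUsed)} MB, external=${toMB(external)} MB`;
}

// console.time starts a named timer; console.timeEnd prints the elapsed time.
console.time('main');
console.log(formatMemoryUsage());
console.timeEnd('main');
```

Calling these at the start and end of each stage gives a per-stage view of both memory and elapsed time in the activation logs.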

Demo

Deploy Action

Run the following command with the IBM Cloud CLI to create the action.

ibmcloud fn action create classify --docker <IMAGE_NAME> index.js

Replace <IMAGE_NAME> with the public Docker Hub image identifier for the custom runtime. Use jamesthomas/action-nodejs-v8:tfjs if you haven't built this manually.

Testing It Out

Download this image of a Panda from Wikipedia.

wget http://bit.ly/2JYSal9 -O panda.jpg

Invoke the action with the Base64 encoded image as an input parameter.

 ibmcloud fn action invoke classify -r -p image $(base64 panda.jpg)

Returned JSON message contains classification probabilities. 🐼🐼🐼

{
  "results": [{
    "className": "giant panda, panda, panda bear, coon bear",
    "probability": 0.9993536472320557
  }]
}

Activation Details

Retrieve logging output for the last activation to show performance data.

ibmcloud fn activation logs --last

Profiling and memory usage details are logged to stdout.

prediction function called.
memory used: rss=150.46 MB, heapTotal=32.83 MB, heapUsed=20.29 MB, external=67.6 MB
loading image and model...
decodeImage: 74.233ms
memory used: rss=141.8 MB, heapTotal=24.33 MB, heapUsed=19.05 MB, external=40.63 MB
imageByteArray: 5.676ms
memory used: rss=141.8 MB, heapTotal=24.33 MB, heapUsed=19.05 MB, external=45.51 MB
imageToInput: 5.952ms
memory used: rss=141.8 MB, heapTotal=24.33 MB, heapUsed=19.06 MB, external=45.51 MB
mn_model.classify: 274.805ms
memory used: rss=149.83 MB, heapTotal=24.33 MB, heapUsed=20.57 MB, external=45.51 MB
classification results: [...]
main: 356.639ms
memory used: rss=144.37 MB, heapTotal=24.33 MB, heapUsed=20.58 MB, external=45.51 MB

main is the total elapsed time for the action handler. mn_model.classify is the elapsed time for the image classification. Cold start requests print an extra log message with model loading time, loadModel: 394.547ms.

Performance Results

Invoking the classify action 1000 times for both cold and warm activations (using 256MB memory) generated the following performance results.

warm invocations

Classifications took an average of 316 milliseconds to process when using warm environments. Looking at the timing data, converting the Base64 encoded JPEG into the input tensor took around 100 milliseconds. Running the model classification task was in the 200 - 250 milliseconds range.

cold invocations

Classifications took an average of 1260 milliseconds to process when using cold environments. These requests incur penalties for initialising new runtime containers and loading models from the filesystem. Both of these tasks took around 400 milliseconds each.

One disadvantage of using custom runtime images in Apache OpenWhisk is the lack of pre-warmed containers. Pre-warming is used to reduce cold start times by starting runtime containers before they are needed. This is not supported for non-standard runtime images.

classification cost

IBM Cloud Functions provides a free tier of 400,000 GB-seconds per month. Each further second of execution is charged at $0.000017 per GB of memory allocated. Execution time is rounded up to the nearest 100ms.

If all activations were warm, a user could execute more than 4,000,000 classifications per month in the free tier using an action with 256MB. Once outside the free tier, around 600,000 further invocations would cost just over $1.

If all activations were cold, a user could execute more than 1,200,000 classifications per month in the free tier using an action with 256MB. Once outside the free tier, around 180,000 further invocations would cost roughly $1.
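The arithmetic behind these figures can be checked with a short script. It uses the average durations and pricing quoted above, and assumes that 256MB is billed as 0.25 GB (1 GB = 1024 MB). The function names are mine.

```javascript
// Pricing figures quoted above.
const FREE_GB_SECONDS = 400000;       // free tier per month
const PRICE_PER_GB_SECOND = 0.000017; // USD
const BILLING_INCREMENT_MS = 100;     // execution time is rounded up to this

// Billed duration for one invocation, in milliseconds.
const billedMs = durationMs =>
  Math.ceil(durationMs / BILLING_INCREMENT_MS) * BILLING_INCREMENT_MS;

// Number of invocations that fit in the free tier.
// Assumes 1 GB = 1024 MB, so 256MB counts as 0.25 GB.
function freeInvocations(durationMs, memoryMb) {
  // Work in MB-milliseconds to keep the arithmetic in whole numbers.
  const freeMbMs = FREE_GB_SECONDS * 1024 * 1000;
  return Math.floor(freeMbMs / (memoryMb * billedMs(durationMs)));
}

// Cost in USD for invocations beyond the free tier.
function costBeyondFreeTier(invocations, durationMs, memoryMb) {
  const gbSeconds =
    invocations * (memoryMb / 1024) * (billedMs(durationMs) / 1000);
  return gbSeconds * PRICE_PER_GB_SECOND;
}

console.log(freeInvocations(316, 256));  // 4000000 (warm, billed as 400ms)
console.log(freeInvocations(1260, 256)); // 1230769 (cold, billed as 1300ms)
console.log(costBeyondFreeTier(600000, 316, 256).toFixed(2)); // 1.02
```

Because of the 100ms rounding, a warm invocation averaging 316ms is billed as 400ms, so shaving time only pays off once it crosses a 100ms boundary.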

Conclusion

TensorFlow.js brings the power of deep learning to JavaScript developers. Using pre-trained models with the TensorFlow.js library makes it simple to add complex machine learning tasks to JavaScript applications with minimal effort and code.

Getting a local script to run image classification was relatively simple, but converting to a serverless function came with more challenges! Apache OpenWhisk restricts the maximum application size to 48MB and native library dependencies were much larger than this limit.

Fortunately, Apache OpenWhisk's custom runtime support allowed us to resolve all these issues. By building a custom runtime with native dependencies and model files, those libraries can be used on the platform without including them in the deployment package.

Machine Learning In Node.js With TensorFlow.js

James Thomas — Thu, 09 Aug 2018 16:41:32 +0000

TensorFlow.js is a new version of the popular open-source library which brings deep learning to JavaScript. Developers can now define, train, and run machine learning models using the high-level library API.

Pre-trained models mean developers can now easily perform complex tasks like visual recognition, generating music or detecting human poses with just a few lines of JavaScript.

Having started as a front-end library for web browsers, recent updates added experimental support for Node.js. This allows TensorFlow.js to be used in backend JavaScript applications without having to use Python.

Reading about the library, I wanted to test it out with a simple task... 🧐

Use TensorFlow.js to perform visual recognition on images using JavaScript from Node.js

Unfortunately, most of the documentation and example code provided uses the library in a browser. Project utilities provided to simplify loading and using pre-trained models have not yet been extended with Node.js support. Getting this working did end up with me spending a lot of time reading the TypeScript source files for the library. 👎

However, after a few days' hacking, I managed to get this completed! Hurrah! 🤩

Before we dive into the code, let's start with an overview of the different TensorFlow libraries.

TensorFlow

TensorFlow is an open-source software library for machine learning applications. TensorFlow can be used to implement neural networks and other deep learning algorithms.

Released by Google in November 2015, TensorFlow was originally a Python library. It used either CPU or GPU-based computation for training and evaluating machine learning models. The library was initially designed to run on high-performance servers with expensive GPUs.

Recent updates have extended the software to run in resource-constrained environments like mobile devices and web browsers.

TensorFlow Lite

TensorFlow Lite, a lightweight version of the library for mobile and embedded devices, was released in May 2017. This was accompanied by a new series of pre-trained deep learning models for vision recognition tasks, called MobileNet. MobileNet models were designed to work efficiently in resource-constrained environments like mobile devices.

TensorFlow.js

Following TensorFlow Lite, TensorFlow.js was announced in March 2018. This version of the library was designed to run in the browser, building on an earlier project called deeplearn.js. WebGL provides GPU access to the library. Developers use a JavaScript API to train, load and run models.

TensorFlow.js was recently extended to run on Node.js, using an extension library called tfjs-node.

The Node.js extension is an alpha release and still under active development.

Importing Existing Models Into TensorFlow.js

Existing TensorFlow and Keras models can be executed using the TensorFlow.js library. Models need converting to a new format using this tool before execution. Pre-trained and converted models for image classification, pose detection and k-nearest neighbours are available on GitHub.

Using TensorFlow.js in Node.js

Installing TensorFlow Libraries

TensorFlow.js can be installed from the NPM registry.

@tensorflow/tfjs - Core TensorFlow.js library
@tensorflow/tfjs-node - TensorFlow.js Node.js extension
@tensorflow/tfjs-node-gpu - TensorFlow.js Node.js extension with GPU support

npm install @tensorflow/tfjs @tensorflow/tfjs-node
# or...
npm install @tensorflow/tfjs @tensorflow/tfjs-node-gpu

Both Node.js extensions use native dependencies which will be compiled on demand.

Loading TensorFlow Libraries

TensorFlow's JavaScript API is exposed from the core library. Extension modules to enable Node.js support do not expose additional APIs.

const tf = require('@tensorflow/tfjs')
// Load the binding (CPU computation)
require('@tensorflow/tfjs-node')
// Or load the binding (GPU computation)
require('@tensorflow/tfjs-node-gpu')

Loading TensorFlow Models

TensorFlow.js provides an NPM library (tfjs-models) to ease loading pre-trained & converted models for image classification, pose detection and k-nearest neighbours.

The MobileNet model used for image classification is a deep neural network trained to identify 1000 different classes.

In the project's README, the following example code is used to load the model.

import * as mobilenet from '@tensorflow-models/mobilenet';

// Load the model.
const model = await mobilenet.load();

One of the first challenges I encountered was that this does not work on Node.js.

Error: browserHTTPRequest is not supported outside the web browser.

Looking at the source code, the mobilenet library is a wrapper around the underlying tf.Model class. When the load() method is called, it automatically downloads the correct model files from an external HTTP address and instantiates the TensorFlow model.

The Node.js extension does not yet support HTTP requests to dynamically retrieve models. Instead, models must be manually loaded from the filesystem.

After reading the source code for the library, I managed to create a work-around...

Loading Models From a Filesystem

Rather than calling the module's load method, if the MobileNet class is created manually, the auto-generated path variable which contains the HTTP address of the model can be overwritten with a local filesystem path. Having done this, calling the load method on the class instance will trigger the filesystem loader class, rather than trying to use the browser-based HTTP loader.

const path = "mobilenet/model.json"
const mn = new mobilenet.MobileNet(1, 1);
mn.path = `file://${path}`
await mn.load()

Awesome, it works!

But where do the model files come from?

MobileNet Models

Models for TensorFlow.js consist of two file types, a model configuration file stored in JSON and model weights in a binary format. Model weights are often sharded into multiple files for better caching by browsers.

Looking at the automatic loading code for MobileNet models, model configuration and weight shards are retrieved from a public storage bucket at this address.

https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v${version}_${alpha}_${size}/

The template parameters in the URL refer to the model versions listed here. Classification accuracy results for each version are also shown on that page.

According to the source code, only MobileNet v1 models can be loaded using the tensorflow-models/mobilenet library.

The HTTP retrieval code loads the model.json file from this location and then recursively fetches all referenced model weight shards. These files are in the format groupX-shard1of1.

Downloading Models Manually

Saving all model files to a filesystem can be achieved by retrieving the model configuration file, parsing out the referenced weight files and downloading each weight file manually.

I want to use the MobileNet V1 model with a 1.0 alpha value and an image size of 224 pixels. This gives me the following URL for the model configuration file.

https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/model.json

Once this file has been downloaded locally, I can use the jq tool to parse all the weight file names.

$ cat model.json | jq -r ".weightsManifest[].paths[0]"
group1-shard1of1
group2-shard1of1
group3-shard1of1
...

Using the sed tool, I can prefix these names with the HTTP URL to generate URLs for each weight file.

$ cat model.json | jq -r ".weightsManifest[].paths[0]" | sed 's/^/https:\/\/storage.googleapis.com\/tfjs-models\/tfjs\/mobilenet_v1_1.0_224\//'
https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/group1-shard1of1
https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/group2-shard1of1
https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/group3-shard1of1
...

Using the parallel and curl commands, I can then download all of these files to my local directory.

$ cat model.json | jq -r ".weightsManifest[].paths[0]" | sed 's/^/https:\/\/storage.googleapis.com\/tfjs-models\/tfjs\/mobilenet_v1_1.0_224\//' | parallel curl -O
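The same URL list can also be built in Node.js, without jq and sed. This is a minimal sketch, assuming the weightsManifest layout implied by the jq query above. The weightFileUrls helper and the sample manifest are made up for illustration. Unlike the jq query, which reads only the first path of each group, it includes every path listed.

```javascript
// Sketch: list weight file URLs from a parsed model.json.
// Assumes each weightsManifest entry has a `paths` array of file names.
function weightFileUrls(modelConfig, baseUrl) {
  const urls = [];
  for (const group of modelConfig.weightsManifest) {
    for (const path of group.paths) {
      urls.push(baseUrl + path);
    }
  }
  return urls;
}

// Tiny made-up manifest, for illustration only.
const sampleConfig = {
  weightsManifest: [
    { paths: ['group1-shard1of1'] },
    { paths: ['group2-shard1of1'] }
  ]
};
const baseUrl = 'https://storage.googleapis.com/tfjs-models/tfjs/mobilenet_v1_1.0_224/';
const urls = weightFileUrls(sampleConfig, baseUrl);
console.log(urls.join('\n'));
```

With the real model.json, the config would come from JSON.parse(fs.readFileSync('model.json')) rather than the sample object.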

Classifying Images

This example code is provided by TensorFlow.js to demonstrate returning classifications for an image.

const img = document.getElementById('img');

// Classify the image.
const predictions = await model.classify(img);

This does not work on Node.js due to the lack of a DOM.

The classify method accepts numerous DOM elements (canvas, video, image) and will automatically retrieve and convert image bytes from these elements into a tf.Tensor3D class which is used as the input to the model. Alternatively, the tf.Tensor3D input can be passed directly.

Rather than trying to use an external package to simulate a DOM element in Node.js, I found it easier to construct the tf.Tensor3D manually.

Generating Tensor3D from an Image

Reading the source code for the method used to turn DOM elements into Tensor3D classes, the following input parameters are used to generate the Tensor3D class.

const values = new Int32Array(image.height * image.width * numChannels);
// fill values with pixel channel bytes from image
const outShape = [image.height, image.width, numChannels];
const input = tf.tensor3d(values, outShape, 'int32');

values is a flat Int32Array which contains a sequential list of channel values for each pixel. numChannels is the number of channel values per pixel.
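To make the layout concrete, this sketch computes where a single channel value sits in the flat array. The valueOffset helper is hypothetical. It assumes the row-major ordering that goes with the [height, width, numChannels] shape.

```javascript
// Hypothetical helper: offset of one channel value in the flat array,
// assuming row-major order for shape [height, width, numChannels].
function valueOffset(row, col, channel, width, numChannels) {
  return (row * width + col) * numChannels + channel;
}

// For a 224 pixel wide RGB image, the blue channel (index 2)
// of the pixel at row 1, column 0:
const offset = valueOffset(1, 0, 2, 224, 3);
console.log(offset); // 674
```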

Creating Input Values For JPEGs

The jpeg-js library is a pure JavaScript JPEG encoder and decoder for Node.js. Using this library, the RGB values for each pixel can be extracted.

const image = jpeg.decode(buffer, true);

This returns an object holding the image width and height, plus a data property: a Uint8Array with four channel values (RGBA) for each pixel (width * height). The MobileNet model only uses the three colour channels (RGB) for classification, ignoring the alpha channel. This code converts the four channel array into the correct three channel version.

const numChannels = 3;
const pixels = image.data; // RGBA bytes from the decoder
const numPixels = image.width * image.height;
const values = new Int32Array(numPixels * numChannels);

for (let i = 0; i < numPixels; i++) {
  for (let channel = 0; channel < numChannels; ++channel) {
    // skip every fourth byte (alpha) in the source array
    values[i * numChannels + channel] = pixels[i * 4 + channel];
  }
}
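Here is the same conversion wrapped in a function and run against a tiny hand-made image, which makes the effect easier to see. The rgbaToRgb name and the sample data are illustrative only. The input is assumed to have the width, height and data fields of the decoder output.

```javascript
// Sketch: drop the alpha channel from decoded RGBA bytes.
// `image` is assumed to look like { width, height, data }, with data
// holding four bytes (RGBA) per pixel.
function rgbaToRgb(image) {
  const numChannels = 3;
  const numPixels = image.width * image.height;
  const values = new Int32Array(numPixels * numChannels);
  for (let i = 0; i < numPixels; i++) {
    for (let channel = 0; channel < numChannels; channel++) {
      values[i * numChannels + channel] = image.data[i * 4 + channel];
    }
  }
  return values;
}

// Made-up 2 x 1 image: one red pixel, one blue pixel, both fully opaque.
const sample = {
  width: 2,
  height: 1,
  data: Uint8Array.from([255, 0, 0, 255, 0, 0, 255, 255])
};
const rgb = rgbaToRgb(sample);
console.log(Array.from(rgb)); // [ 255, 0, 0, 0, 0, 255 ]
```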

MobileNet Models Input Requirements

The MobileNet model being used classifies images of width and height 224 pixels. Input tensors must contain float values, between -1 and 1, for each of the three channel values per pixel.

Images of different dimensions need to be resized before classification. Additionally, pixel values from the JPEG decoder are in the range 0 to 255, rather than -1 to 1. These values also need converting prior to classification.

TensorFlow.js has library methods to make this process easier but, fortunately for us, the tfjs-models/mobilenet library automatically handles this issue! 👍

Developers can pass in Tensor3D inputs of type int32 and different dimensions to the classify method and it converts the input to the correct format prior to classification. Which means there's nothing to do... Super 🕺🕺🕺.
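For reference, the value conversion described above is simple arithmetic. This sketch shows one way to map a byte in the range 0 to 255 onto -1 to 1. It is an illustration only, not code taken from the library, and the scalePixel name is made up.

```javascript
// Illustration only: map a pixel byte (0 to 255) onto the range -1 to 1.
// The library handles this step for you; this just shows the arithmetic.
function scalePixel(value) {
  return value / 127.5 - 1;
}

console.log(scalePixel(0));     // -1
console.log(scalePixel(127.5)); // 0
console.log(scalePixel(255));   // 1
```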

Obtaining Predictions

MobileNet models in TensorFlow are trained to recognise entities from the top 1000 classes in the ImageNet dataset. The models output the probabilities that each of those entities is in the image being classified.

The full list of trained classes for the model being used can be found in this file.

The tfjs-models/mobilenet library exposes a classify method on the MobileNet class to return the top X classes with highest probabilities from an image input.

const predictions = await mn_model.classify(input, 10);

predictions is an array of X classes and probabilities in the following format.

{
  className: 'panda',
  probability: 0.9993536472320557
}
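To show what "top X classes" means in practice, this sketch picks the highest-probability entries from a list of raw scores. It is not the library's implementation. The topClasses helper, the class names and the probabilities are all made up for illustration.

```javascript
// Sketch: pick the k highest-probability classes from raw model outputs.
// Not the library's implementation; labels and scores are made up.
function topClasses(probabilities, classNames, k) {
  return probabilities
    .map((probability, index) => ({ className: classNames[index], probability }))
    .sort((a, b) => b.probability - a.probability)
    .slice(0, k);
}

const classNames = ['tabby', 'giant panda', 'teapot'];
const probabilities = [0.02, 0.97, 0.01];
const top = topClasses(probabilities, classNames, 2);
console.log(top);
```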

Example

Having worked out how to use the TensorFlow.js library and MobileNet models on Node.js, this script will classify an image given as a command-line argument.

source code

Save this script file and package descriptor to local files.

testing it out

Download the model files to a mobilenet directory using the instructions above.
Install the project dependencies using NPM

npm install

Download a sample JPEG file to classify

wget http://bit.ly/2JYSal9 -O panda.jpg

Run the script with the model file and input image as arguments.

node script.js mobilenet/model.json panda.jpg

If everything worked, the following output should be printed to the console.

classification results: [ {
    className: 'giant panda, panda, panda bear, coon bear',
    probability: 0.9993536472320557 
} ]

The image is correctly classified as containing a Panda with 99.93% probability! 🐼🐼🐼

Conclusion

Having been released as a browser-based library, TensorFlow.js has now been extended to work on Node.js, although not all of the tools and utilities support the new runtime. With a few days' hacking, I was able to use the library with the MobileNet models for visual recognition on images from a local file.

Getting this working in the Node.js runtime means I can now move on to my next idea... making this run inside a serverless function! Come back soon to read about my next adventure with TensorFlow.js. 👋