Matt Hamilton for IBM Developer

Processing File Uploads with IBM Cloud Functions

I've been doing a lot with IBM Cloud Functions recently. So far I've always been passing simple query string values or JSON-encoded data. But what if you want to upload an actual binary file? The usual way, in the web world, is to use the multipart/form-data content type.

But how does Apache OpenWhisk (which IBM Cloud Functions is based on) handle this?

If you set the function to raw type, rather than getting the JSON-decoded values passed to your function as a dictionary, you get a dictionary that gives you the raw data. The data itself is base64 encoded in __ow_body and the headers are in __ow_headers.
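To make the shape of that dictionary concrete, here is a minimal sketch. The example_args values (boundary, body) are made up for illustration, and raw_body and content_type are hypothetical helper names, not part of OpenWhisk:

```python
from base64 import b64decode, b64encode

# Hypothetical example of what a raw web action might receive; the
# boundary and body are invented for illustration only.
example_args = {
    "__ow_headers": {"content-type": "multipart/form-data; boundary=XYZ"},
    "__ow_body": b64encode(b"--XYZ\r\n...").decode("ascii"),
}


def raw_body(args):
    """Return the request body as bytes, decoded from base64."""
    return b64decode(args.get("__ow_body", ""))


def content_type(args):
    """Return the Content-Type header, or an empty string if it is absent."""
    return args.get("__ow_headers", {}).get("content-type", "")
```

Using .get() with defaults means a request without a body or headers yields empty values instead of raising a KeyError.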

There are several ways to process this in Python. The requests_toolbelt package has a MultipartDecoder class. But I'm going to use the parse_multipart function from the cgi module, which is included in Python itself:

from cgi import parse_multipart, parse_header
from io import BytesIO
from base64 import b64decode

def main(args):
    c_type, p_dict = parse_header(args['__ow_headers']['content-type'])
    decoded_string = b64decode(args['__ow_body'])
    p_dict['boundary'] = bytes(p_dict['boundary'], "utf-8")
    p_dict['CONTENT-LENGTH'] = len(decoded_string)
    form_data = parse_multipart(BytesIO(decoded_string), p_dict)

    # Do something with the data. In this simple example
    # we will just return a dict with the part name and 
    # length of the content of that part
    ret = {}
    for key, value in form_data.items():
        ret[key] = len(value[0])

    return ret

The parse_multipart function looks for a CONTENT-LENGTH entry in the parameter dictionary, so we need to add that to p_dict before passing it to the parser.
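One caveat: the cgi module was deprecated in Python 3.11 and removed in Python 3.13, so the function above needs a different parser on newer runtimes. As a rough sketch of one alternative (not the approach used above), the standard library's email package can parse the same multipart body. The parse_form helper name is my own, and unlike cgi this sketch returns every value as bytes, including plain string fields:

```python
import email
from base64 import b64decode


def parse_form(content_type, body):
    """Parse a multipart/form-data body into {field name: [bytes, ...]}.

    Prepends the Content-Type header to the raw body so the email
    parser can find the boundary and split the parts.
    """
    header = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    message = email.message_from_bytes(header + body)
    fields = {}
    if message.is_multipart():
        for part in message.get_payload():
            # The field name lives in the part's Content-Disposition header.
            name = part.get_param("name", header="content-disposition")
            fields.setdefault(name, []).append(part.get_payload(decode=True))
    return fields


def main(args):
    form_data = parse_form(
        args["__ow_headers"]["content-type"],
        b64decode(args["__ow_body"]),
    )
    # Same result shape as before: part name mapped to content length.
    return {key: len(values[0]) for key, values in form_data.items()}
```

The email parser reads the whole body into memory, which should be acceptable for payloads of the size a cloud function accepts.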

To create this cloud function, we need to set the --web raw flag to tell OpenWhisk not to try to parse the payload as JSON for us:

% ibmcloud fn action create upload upload.py --web raw

We can then get the URL for it:

% ibmcloud fn action get upload --url
ok: got action upload
https://eu-gb.functions.appdomain.cloud/api/v1/web/1d0ffa5a-835d-4c40-ac80-77ca4a35f028/upload

We can then call it using cURL and upload some data to it. In this case we are passing both a simple string value (foo) and the contents of a binary WAV audio file. The .json extension appended to the URL asks OpenWhisk to return the result as JSON:

% curl -F id=foo -F audio=@test.wav https://eu-gb.functions.appdomain.cloud/api/v1/web/1d0ffa5a-835d-4c40-ac80-77ca4a35f028/upload.json
{
  "audio": 3840102,
  "id": 3
}
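The cURL request above can also be approximated locally, which is handy for trying out the function before deploying it. The sketch below builds the kind of argument dictionary a raw web action receives for a multipart upload. The build_raw_args name and its argument layout are my own invention, and it only covers the two header keys used in this post:

```python
import uuid
from base64 import b64encode


def build_raw_args(fields, files):
    """Build raw web action arguments for a multipart/form-data upload.

    fields: {name: str}; files: {name: (filename, bytes)}.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    chunks = []
    for name, value in fields.items():
        chunks += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
            b"",  # blank line separates part headers from content
            value.encode("utf-8"),
        ]
    for name, (filename, content) in files.items():
        disposition = f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'
        chunks += [
            delimiter,
            disposition.encode("utf-8"),
            b"Content-Type: application/octet-stream",
            b"",
            content,
        ]
    chunks += [delimiter + b"--", b""]  # closing delimiter plus final CRLF
    body = b"\r\n".join(chunks)
    return {
        "__ow_headers": {"content-type": f"multipart/form-data; boundary={boundary}"},
        "__ow_body": b64encode(body).decode("ascii"),
    }
```

For example, build_raw_args({"id": "foo"}, {"audio": ("test.wav", wav_bytes)}) gives a dictionary that can be passed straight to main, where wav_bytes stands for the file contents read in binary mode.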

It seems that IBM Cloud Functions has a limit of 5 MB for the payload. For anything larger than this, you'd probably be better off uploading it to IBM Cloud Object Storage (COS) and then passing the URL of that uploaded item to the cloud function.
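On the client side, that decision could be made with a simple size check before uploading. This is only a sketch: the 5 MB constant is the approximate figure mentioned above rather than a documented value, and the fits_in_payload name and the overhead allowance for the multipart headers are my own assumptions:

```python
import os

# Assumption: roughly 5 MB, as observed above; check the platform
# documentation for the exact limit.
PAYLOAD_LIMIT_BYTES = 5 * 1024 * 1024


def fits_in_payload(path, limit=PAYLOAD_LIMIT_BYTES, overhead=1024):
    """Return True if the file is likely small enough to post directly.

    overhead is a guessed allowance for multipart boundaries and headers.
    """
    return os.path.getsize(path) + overhead <= limit
```

If the check fails, the client would upload the file to COS instead and send only its URL to the function.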

Oldest comments (1)

Jani Siivola

Thank you sir :-) Today I was just struggling with how AWS API Gateway HTTP API could handle multipart/form-data and this same IBM example works perfectly there as well.