Jason Andrews for AWS Community Builders

Prepare complex machine learning applications for Raspberry Pi using AWS Graviton Processors

In "AWS Graviton processors expand options for embedded Linux developers," I explained how Graviton-based EC2 instances play a role in embedded Linux software development. Machine learning is one area where this is having an impact. Machine learning frameworks are typically C++ projects with dependencies on many underlying libraries. This makes them difficult to customize, build, and deploy on Arm-based edge compute.

There are many articles about how to get machine learning frameworks and libraries onto embedded Linux boards. Authors try things like cross-compiling, instruction translation with QEMU, or brute-force native building on a board like the Raspberry Pi. Today, I have an example of another way to do this that is easier and saves time.

MxNet is a flexible and efficient library for deep learning. It is known to be difficult to build for the Raspberry Pi 3B or 4. The challenge is primarily due to the large amount of memory required. Using Docker to prepare a container image is an option. Pre-built Python wheel files can also be found, but they quickly get out of date if they are supplied by community members.

The process below demonstrates how to prepare a Raspberry Pi OS SD card using an AWS Graviton-based EC2 instance. The image includes MxNet and is ready to write to an SD card without the need to use Docker, cross-compile, or wait hours for a native build on the Pi.

Let’s get started.

First, create a new EC2 instance. The instance can be any of the instance types powered by Graviton processors including A1, T4g, M6g, C6g, or R6g. There are numerous AWS tutorials on how to create an AWS account and use the AWS Console to configure and launch a new EC2 instance.

For this example, I created a t4g.2xlarge instance running Ubuntu 18.04. This gives me 8 vCPUs and 32 GiB of memory to work with.

Connect to the new instance using ssh and the key file assigned to the instance.

$ ssh -i key.pem ubuntu@<ec2-ip-address>
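Because the later chroot step runs the image's arm64 binaries directly on the host, it may be worth confirming that the instance reports a 64-bit Arm kernel before going further. A minimal sketch, where is_arm64 is a hypothetical helper name and not part of any tool:

```shell
#!/usr/bin/env bash
# Succeeds when the given machine name (default: this host's `uname -m`)
# is the 64-bit Arm name reported by Linux.
is_arm64() {
    local machine=${1:-$(uname -m)}
    [ "$machine" = "aarch64" ]
}

if is_arm64; then
    echo "arm64 host: the Raspberry Pi OS chroot can run natively"
else
    echo "not an arm64 host: the chroot would need emulation" >&2
fi
```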

Now, in the EC2 instance, download the 64-bit Raspberry Pi OS Lite image (the latest release at the time of writing) and unzip it.

$ wget http://downloads.raspberrypi.org/raspios_lite_arm64/images/raspios_lite_arm64-2020-08-24/2020-08-20-raspios-buster-arm64-lite.zip
$ sudo apt install unzip
$ unzip 2020-08-20-raspios-buster-arm64-lite.zip

Increase the size of the Raspberry Pi OS image, mount it, and use chroot to prepare for the MxNet build. Use any loop device that is not already in use on the system. I use number 10 here; running sudo losetup -f prints the first unused loop device if you need to pick another.

$ sudo losetup -P /dev/loop10 2020-08-20-raspios-buster-arm64-lite.img
$ sudo fallocate -l 8000M 2020-08-20-raspios-buster-arm64-lite.img
$ sudo losetup -c /dev/loop10
$ sudo parted /dev/loop10
GNU Parted 3.3
Using /dev/loop10
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print free                                                       
Model: Loopback device (loopback)
Disk /dev/loop10: 8389MB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type     File system  Flags
        16.4kB  4194kB  4178kB           Free Space
 1      4194kB  273MB   268MB   primary  fat32        lba
 2      273MB   2001MB  1728MB  primary  ext4
        2001MB  8389MB  6388MB           Free Space

(parted) resizepart 2                                                     
End?  [2001MB]? 8389MB                                                    
(parted) q                                                                
Information: You may need to update /etc/fstab.

$ sudo e2fsck -f /dev/loop10p2
$ sudo resize2fs /dev/loop10p2
$ sudo mount /dev/loop10p2 /mnt
$ sudo mount /dev/loop10p1 /mnt/boot
$ cd /mnt
$ sudo mount -t proc /proc proc/
$ sudo mount --rbind /sys sys/
$ sudo mount --rbind /dev dev/
$ sudo chroot /mnt /bin/bash
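The 8000M passed to fallocate and the 8389MB reported by parted are the same size in different units: fallocate treats the M suffix as MiB (1024 × 1024 bytes), while parted prints decimal megabytes by default. The conversion can be sketched as follows; the helper names are made up for illustration.

```shell
#!/usr/bin/env bash
# Convert a size in MiB (fallocate's M suffix) to bytes.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Convert bytes to decimal megabytes, rounded to the nearest MB.
bytes_to_mb() {
    echo $(( ($1 + 500000) / 1000000 ))
}

bytes=$(mib_to_bytes 8000)
echo "image size: ${bytes} bytes, about $(bytes_to_mb "$bytes") MB"
```

This is why the End value entered at the resizepart prompt is 8389MB and not 8000MB.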

Inside the chroot, install the software required for the MxNet build.

# apt update
# apt upgrade -y
# apt-get -y install git cmake ninja-build liblapack* libblas* libopencv* libopenblas* python3-dev python3-pip python-dev virtualenv
# pip3 install Cython

Change to user pi and build MxNet.

$ su pi
$ cd $HOME
$ git clone https://github.com/apache/incubator-mxnet.git --recursive
$ cd incubator-mxnet
$ mkdir build
$ cd build
$ cmake \
-DUSE_SSE=OFF \
-DUSE_CUDA=OFF \
-DUSE_OPENCV=ON \
-DUSE_OPENMP=ON \
-DUSE_MKL_IF_AVAILABLE=OFF \
-DUSE_SIGNAL_HANDLER=ON \
-DBUILD_CYTHON_MODULES=ON \
-DCMAKE_BUILD_TYPE=Release \
-GNinja ..
$ ninja -j16
$ cd ../python
$ sudo pip3 install -e . 

Building MxNet takes less than 20 minutes on the EC2 instance. A larger instance would likely be faster still.

The steps above also work for a native build on a Raspberry Pi 4, but setting the number of jobs too high results in out-of-memory failures. I tried -j4 and the build failed on a Raspberry Pi 4 with 8 GB of RAM. I ran ninja with -j1 and the build completed, but it took over six and a half hours.
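For a native build, one option is to start with a higher job count and fall back to a lower one when the build fails. Ninja keeps the object files it has already produced, so a retry continues from where the failed run stopped. The sketch below is an illustration only: try_jobs and USED_JOBS are hypothetical names, and it assumes an out-of-memory failure makes ninja exit with a non-zero status instead of hanging the board.

```shell
#!/usr/bin/env bash
# try_jobs CMD N1 N2 ...
# Run "CMD -jN" for each N in turn and stop at the first success.
# On success, USED_JOBS holds the job count that worked.
try_jobs() {
    local cmd=$1
    shift
    local n
    for n in "$@"; do
        echo "trying: ${cmd} -j${n}" >&2
        if "$cmd" "-j${n}"; then
            USED_JOBS=$n
            return 0
        fi
    done
    return 1
}

# Example (run from the build directory): try_jobs ninja 4 2 1
```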

Test the result with Python.

$ python3
Python 3.7.3 (default, Jul 25 2020, 13:03:44) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet
>>> mxnet.__version__
'2.0.0'
>>> quit()

To move to the Raspberry Pi board, download the .img file from the EC2 instance, write it to an SD card using dd or the Raspberry Pi Imager, and power up the Raspberry Pi. The first boot resizes the file system to fit the SD card. Log in with pi as the username and MxNet is immediately available.
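Before downloading the image, leave the chroot (exit the pi shell, then the chroot shell), change out of /mnt, and unmount the image so that all writes reach the .img file. These steps are not shown above. The following is a sketch that assumes the /mnt mount point and /dev/loop10 used earlier. The helper names and the DRY_RUN switch are my own. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: release the image after leaving the chroot.
# Assumes the mount point and loop device used earlier in this article.
MNT=${MNT:-/mnt}
LOOPDEV=${LOOPDEV:-/dev/loop10}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands; 0 = run them with sudo

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "sudo $*"
    else
        sudo "$@"
    fi
}

release_image() {
    # Keep the recursive unmount from propagating to the host's /sys and /dev.
    run mount --make-rslave "$MNT/sys"
    run mount --make-rslave "$MNT/dev"
    # Unmount everything under the mount point (proc, sys, dev, boot, root).
    run umount -R "$MNT"
    # Detach the loop device so the .img file is no longer in use.
    run losetup -d "$LOOPDEV"
}

release_image
```

Review the printed commands first, then rerun with DRY_RUN=0 from a directory outside /mnt. A shell whose working directory is still /mnt will keep the mount busy.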

Conclusion

AWS Graviton processors augment the development environment for projects targeting high-performance IoT applications and other embedded projects using Linux on Arm. Graviton processors make it easier and faster to develop, build, and test software.
