In a recent post I showed how challenging it still is to build TensorFlow C bindings for the Raspberry Pi and other SBCs (single-board computers), and how there are no pre-built binaries.
As you could read there, I succeeded with one approach (cross-compiling with a Raspberry-Pi-only script), but I wasn't yet able to compile on the target itself (a Raspberry Pi 3 in this example) and I still had quite a few open questions.
In this post I will show some more successful attempts and answer some of those questions.
Successful attempt: build from scratch on Raspberry Pi 3B+ and Raspbian Buster
I suspected that one of the reasons I failed in the first place was a lack of swap space and free disk space.
It was the perfect excuse to buy another 64GB fast micro SD, try Raspbian Buster and start from a fresh install. So, that’s what I did.
If you remember from the previous post, the first thing you need to do is install Bazel. This time, though, the first thing I did was give it plenty of swap space: instead of the 2GB I used before, I now assigned 8GB.
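On Raspbian, the swap file size is controlled by `/etc/dphys-swapfile`. Here is a minimal sketch of bumping it; the helper function name is my own, and note that `CONF_MAXSWAP` in the same file may also cap the effective size, so check that value too:

```shell
# Sketch: raise CONF_SWAPSIZE in dphys-swapfile's config file.
# The function name is my own; the dphys-swapfile commands below
# (run as root on a real Pi) apply the change.
set_swapsize() {  # usage: set_swapsize <config-file> <size-in-MB>
  sed -i "s/^CONF_SWAPSIZE=.*/CONF_SWAPSIZE=$2/" "$1"
}

# On a real Pi (needs root):
#   set_swapsize /etc/dphys-swapfile 8192
#   dphys-swapfile setup && dphys-swapfile swapon
```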
Next, remember that you can't use just any version of Bazel: you need the version that TensorFlow expects. One open question I had from the previous post was "how do I know which one?" And here is the answer: check the file tensorflow/tensorflow/tools/ci_build/install/install_bazel.sh
For example, for 1.13.1 you can see:
# Select bazel version.
BAZEL_VERSION="0.20.0"
set +e
local_bazel_ver=$(bazel version 2>&1 | grep -i label | awk '{print $3}')
if [[ "$local_bazel_ver" == "$BAZEL_VERSION" ]]; then
exit 0
fi
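If you want to script that check, something like this pulls the version straight out of that file (the function name and the grep/sed pipeline are my own sketch, not part of the TensorFlow tooling):

```shell
# Sketch: read the Bazel version TensorFlow's CI expects from
# install_bazel.sh. The helper name is my own.
required_bazel_version() {
  grep -m1 '^BAZEL_VERSION=' "$1" | sed 's/^BAZEL_VERSION="\(.*\)"$/\1/'
}

# e.g. required_bazel_version tensorflow/tensorflow/tools/ci_build/install/install_bazel.sh
```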
OK, so I started compiling Bazel following the instructions from the previous post, and I ran into problems with the Java VM. I'm sorry, I didn't write down the exact issue, but it was clearly an error related to some Java compilation step that I did NOT hit in my previous attempt.
My first step was to compare the JVM versions (java --version) of this Raspbian Buster install against the Raspbian Stretch I used before. Stretch showed java version "1.8.0_65", while Buster showed openjdk 11.0.3 2019-04-16. OK... so Stretch came with Java 8 while Buster came with Java 11.
This is when I guessed that maybe Bazel can only be built with one particular Java version. Which one? No clue (tell me if you know). So what I did on my fresh Raspbian Buster was to install Java 8 too:
sudo apt-get install openjdk-8-jdk
After that, you can list the available alternatives:
$ update-java-alternatives -l
java-1.11.0-openjdk-armhf 1111 /usr/lib/jvm/java-1.11.0-openjdk-armhf
java-1.8.0-openjdk-armhf 1081 /usr/lib/jvm/java-1.8.0-openjdk-armhf
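If you'd rather script the selection than eyeball it, you could extract the Java 8 home from that listing; the awk pattern and function name below are my own assumption, not a standard tool:

```shell
# Sketch: pick the Java 8 JAVA_HOME out of `update-java-alternatives -l`
# output (read from stdin) without changing the system default.
jdk8_home() {
  awk '/java-1\.8\.0/ { print $3 }'
}

# e.g. JAVA_HOME="$(update-java-alternatives -l | jdk8_home)"
```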
Obviously, I didn't want to change the default Java version for my whole OS, so I used the following workaround:
env BAZEL_JAVAC_OPTS="-J-Xms384m -J-Xmx1024m" \
JAVA_TOOL_OPTS="-Xmx1024m" \
JAVA_HOME="/usr/lib/jvm/java-1.8.0-openjdk-armhf" \
EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" \
bash ./compile.sh
Basically, I am just telling Bazel to use Java 8. With that, the compilation finished correctly. The next step was to build TensorFlow itself.
I started the build process but hit another compilation issue, this time related to AWS, which I could work around by simply telling Bazel NOT to compile the AWS support.
Finally, after 13 hours (yes, that's the expected time) the build finished successfully.
The final Bazel compilation script for TensorFlow was like this:
env JAVA_HOME="/usr/lib/jvm/java-1.8.0-openjdk-armhf" \
bazel --host_jvm_args=-Xmx1024m --host_jvm_args=-Xms384m build \
--config=noaws \
--config opt --verbose_failures --local_resources 1024,1.0,1.0 \
--copt=-mfpu=neon-vfpv4 \
--copt=-ftree-vectorize \
--copt=-funsafe-math-optimizations \
--copt=-ftree-loop-vectorize \
--copt=-fomit-frame-pointer \
--copt=-DRASPBERRY_PI \
--host_copt=-mfpu=neon-vfpv4 \
--host_copt=-ftree-vectorize \
--host_copt=-funsafe-math-optimizations \
--host_copt=-ftree-loop-vectorize \
--host_copt=-fomit-frame-pointer \
--host_copt=-DRASPBERRY_PI \
//tensorflow/tools/lib_package:libtensorflow
Interesting points:
- I added the `JAVA_HOME` workaround for Java 8.
- I removed the `--jobs=3` flag I used before because it didn't seem to change much.
- I added the `--config=noaws` workaround for AWS.
- Since I knew it would take a long time to compile, I wanted to keep an eye on the CPU temperature, so every once in a while I ran `/opt/vc/bin/vcgencmd measure_temp` and checked its output.
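Instead of checking by hand, the reading can be polled in a loop. `vcgencmd` prints something like `temp=48.3'C`, so a small parser helps; this helper is my own sketch:

```shell
# Sketch: extract the numeric temperature from vcgencmd's output,
# e.g. "temp=48.3'C" -> "48.3". The function name is my own.
parse_temp() {
  printf '%s\n' "$1" | sed "s/^temp=\([0-9.]*\)'C$/\1/"
}

# On the Pi, poll every 60 seconds:
#   while true; do parse_temp "$(/opt/vc/bin/vcgencmd measure_temp)"; sleep 60; done
```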
Once it finishes, you may want to copy the resulting files somewhere. In my case I did:
sudo cp bazel-bin/tensorflow/libtensorflow.so /usr/local/lib/
sudo cp bazel-bin/tensorflow/libtensorflow_framework.so /usr/local/lib/
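After copying, it's worth running `sudo ldconfig` to refresh the linker cache and confirming the files actually landed. A tiny existence check (the helper below is my own, not a standard tool) can double as a sanity test:

```shell
# Sketch: report whether a shared library exists and is non-empty.
check_lib() {
  if [ -s "$1" ]; then echo "found: $1"; else echo "missing: $1"; fi
}

# e.g. check_lib /usr/local/lib/libtensorflow.so
#      check_lib /usr/local/lib/libtensorflow_framework.so
```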
Failed attempt (but with learnings!): build from scratch on a Raspberry Pi 3B+ with Ubuntu Server 18.04 on aarch64 (64-bit ARM)
Once I had the 32-bit ARM build working, my next step was to try aarch64 (64-bit ARM). At the time of writing, there is no official 64-bit Raspbian release. This wasn't the first time I wanted to run something on the Pi 3B+ with aarch64, so I already had a micro SD with Ubuntu Server 18.04 up and running.
I started eating my own dog food, and here came the first issue: the way to change the swap space is not the same on Ubuntu as on Raspbian. So I followed this guide for Ubuntu and again assigned 8GB.
Second, it turns out you need python installed, and this Ubuntu image did not ship with any Python at all. That's when I understood why some blog posts start with a "first, install dependencies..." section and provide a line like the one below:
sudo apt-get install gcc g++ swig build-essential openjdk-8-jdk python zip unzip
I'll let you decide whether you want Python 2.x or 3.x. As we found out before, the JDK version also matters.
Anyway, after those two issues I was able to compile and run Bazel. However, as soon as I tried to run the previous Bazel script for TensorFlow, I got a compilation error saying that --copt=-mfpu=neon-vfpv4 was not a recognized option.
Thanks to freedomtan, who told me that I don't need all those extra --copts and --host_copts (they exist to deal with the quirks of the Raspbian environment). The Bazel script can therefore be much simpler:
bazel --host_jvm_args=-Xmx1024m --host_jvm_args=-Xms384m build \
--config=noaws \
--config opt --verbose_failures --local_resources 1024,1.0,1.0 \
//tensorflow/tools/lib_package:libtensorflow
I thought this was going to work, but after many hours it just didn't finish; it somehow hung. I didn't fight it any longer because this Ubuntu installation was never really stable for me. I will try again soon with a fresh install of Debian Buster or something similar for aarch64.
Successful attempt: convinced someone else to build it!
In the previous post I mentioned this GitHub repo providing binaries of TensorFlow for the Raspberry Pi. The drawback was that all the binaries shipped there were Python-only wheels, not the C shared library.
After a nice discussion with the author, he now builds it for ARM and ARM64(!), and seems to have included it as part of his release process. But that's not all. It seems that even though Google itself provides official Python wheels for the Raspberry Pi, many people still use his builds. Why? He explains it himself:
The difference between my Wheel file and the official Wheel is the following two points.
* Official Wheel is disabled by default on Tensorflow Lite. However, my wheel is enabled by default.
* Tuned to 2.5 times the performance of the official Tensorflow Lite.
It is 1. that is considered a problem by engineers around the world.
You can read more at the link above and see how to get those binaries! By the way, the same author also provides Bazel binaries for ARM!
Future attempts (yes, the journey is not over!)
- Cross-compile myself (any combination, for that matter) using these instructions. The main issue with this kind of compilation is that for some reason the resulting shared library seems to be linked against a specific glibc version. Its usage is therefore limited because it will only work on operating systems shipping that glibc (you can try juggling multiple glibc versions on the same host, but it seems to be a pain). If you know how to fix this, please let us know.
- Cross-compile myself using these other instructions, which also seem interesting. Unfortunately, the author does not show much interest in providing C bindings.
- Try the aarch64 compilation on the Pi again, but with a different OS (not Ubuntu).
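On the glibc point in the first bullet: a quick way to see which glibc a cross-compiled library actually requires is to scan the versioned symbols in `objdump -T` output. The parsing helper below is my own diagnostic sketch, not something from the linked instructions:

```shell
# Sketch: given `objdump -T libtensorflow.so` output on stdin, print the
# highest GLIBC symbol version the library requires. The target OS must
# ship at least this glibc version for the library to load.
max_glibc_version() {
  grep -o 'GLIBC_[0-9][0-9.]*' | sed 's/^GLIBC_//' | sort -Vu | tail -n 1
}

# e.g. objdump -T libtensorflow.so | max_glibc_version
```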