
wolfroma


Ampere Altra vs IceLake and Epyc for Java applications (ARM/Intel/AMD)

The goal of this story is to compare the suitability of Altra processors for Java-based applications against Intel and AMD processors.

ARM processors are poised to play a pivotal role in the future of cloud computing. Leading cloud service providers have embraced ARM processors, enhancing their capabilities. The primary driving factors behind this adoption are cost-efficiency and improved performance for a broad range of workloads. Nevertheless, the suitability of ARM processors largely hinges on the specific workload they are expected to handle within a virtual machine (VM). Given the widespread distribution of Java applications, it becomes imperative to explore whether migrating Java applications to Ampere Altra processors is a wise move.

Introduction

AWS was the first to bring ARM-based processors, Graviton, to the market, and I have watched Graviton processors improve over time. I used Graviton 2 processors in my work: they were good, but had a problem with encryption-related functions (HTTPS/TLS/SSL). The reason was that Graviton 2 had far fewer multiplication units than the comparable Intel-based processors (m5 series). Graviton 3 processors were improved with more multiplication units, which sped up encryption-related workloads by at least 2x. Here we will compare Azure D-series VM types based on Ampere Altra, AMD Epyc and Intel Ice Lake, focusing on the specific workload type that I believe most Java applications use.
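As an aside, a quick way to sanity-check hardware crypto support and RSA throughput on a given VM is shown below (a minimal sketch; it assumes lscpu and openssl are present in the image):

# Look for hardware crypto flags (aes/pmull/sha2 on ARM, aes/avx on x86)
lscpu | grep -i -E 'aes|pmull|sha2|avx'
# RSA sign/verify throughput is a reasonable proxy for TLS handshake cost
openssl speed rsa2048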

Typical Cloud Java Application

There is a basic psychological principle: “While individual differences exist among all humans, there are fundamental commonalities in the way people think and perceive the world.” This principle recognizes that despite unique characteristics and experiences, there are universal cognitive and psychological processes that underlie human thought and behavior. The same can be said about Java applications. Let’s describe a typical Java application.

A typical Java application in the cloud world is a microservice built with Spring Boot. It has a backend store and connects to it securely using SSL/TLS connections. Incoming requests are likewise protected by the HTTPS protocol with TLS encryption. It runs on a standard JDK that relies on the underlying operating system to accept incoming TCP connections and performs encryption through a standard provider such as OpenSSL (the JDK can delegate crypto work to the OS-level provider). A typical Java application doesn’t use external native components, since most components are already provided by third-party libraries.
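As a quick illustration of how much of this depends on the hardware, you can check whether a given HotSpot JDK enables its hardware crypto intrinsics on the host (a minimal sketch; it assumes java is on the PATH):

# "true" for these flags means the JIT uses the CPU's AES/SHA instructions
java -XX:+PrintFlagsFinal -version | grep -i -E 'UseAES|UseSHA'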

Methodology

We need to compare all components of the stack to make sure performance is accounted for at every layer:

  • CPU & memory benchmark (sysbench)
  • OpenSSL benchmark (openssl speed & Phoronix)
  • JVM benchmark (OpenJDK 11 and 17)

I understand that some might think we could run more tests, but I believe this set gives representative data for the typical cloud Java application (a rough sketch of the underlying commands is shown below). If you are determined to run more tests, I would recommend including JavaGC and benchmarking a sample Spring Boot application, e.g. by using Vegeta.
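For orientation, the CPU/memory and crypto benchmarks roughly boil down to commands like these (a minimal sketch; the Phoronix Test Suite wraps them with its own options and result collection):

sysbench cpu --threads=4 run        # integer throughput across the 4 vCPUs
sysbench memory --threads=4 run     # memory read/write bandwidth
openssl speed rsa2048 aes-256-cbc sha256   # crypto primitives used by TLS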

We will use comparable VMs in the Azure cloud with a CentOS-based system.

  • Standard_D4pds_v5 — 4 vCPU, 16 GB memory, Ampere® Altra® Arm-based processor operating at 3.0 GHz ($154.76 per month, pay as you go plan on Nov 4 2023)
  • Standard_D4ads_v5 — 4 vCPU, 16 GB memory, 3rd Generation AMD EPYC™ 7763 processor operating at 3.5 GHz ($178.12 per month, pay as you go plan on Nov 4 2023)
  • Standard_D4d_v4 — 4 vCPU, 16 GB memory, 3rd Generation Intel® Xeon® Platinum 8370C (Ice Lake) processor operating at 3.5 GHz ($194.18 per month, pay as you go plan on Nov 4 2023)

Script for configuring the VM and running the benchmarks

set -eu -o pipefail
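# Download the Phoronix Test Suite and run its sysbench CPU & memory benchmark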

curl -LJO https://github.com/phoronix-test-suite/phoronix-test-suite/archive/refs/heads/master.zip
unzip phoronix-test-suite-master.zip
cd phoronix-test-suite-master/
sudo yum install php -y
sudo yum install php-xml -y
sudo ./phoronix-test-suite benchmark sysbench
#3: Test All Options
#Would you like to save these test results (Y/n): Y
#Enter a name for the result file: <altra|icelake|epyc>_sysbench_cpu
#Enter a unique name to describe this test run / configuration: icelake_sysbench_cpu

openssl version
#OpenSSL 1.0.2k-fips  26 Jan 2017
openssl speed > openssl_results.txt
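
# Build OpenSSL 3.1.4 from source (the stock 1.0.2 build is outdated); the
# perl modules below are needed by the OpenSSL build and test scripts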

sudo yum install perl-IPC-Cmd perl-Test-Simple -y

# install openssl 3.1.4
cd /usr/src
sudo wget https://www.openssl.org/source/openssl-3.1.4.tar.gz
sudo tar -zxf openssl-3.1.4.tar.gz
sudo rm openssl-3.1.4.tar.gz

cd openssl-3.1.4/
sudo ./config
sudo make
sudo make test
sudo make install


if [[ "$(arch)" == x86_64 ]] ; then
  #x86_64
  sudo ln -s /usr/local/lib64/libssl.so.3 /usr/lib64/libssl.so.3
  sudo ln -s /usr/local/lib64/libcrypto.so.3 /usr/lib64/libcrypto.so.3
else
  #ARM
  sudo ln -s /usr/local/lib/libssl.so.3 /usr/lib64/libssl.so.3
  sudo ln -s /usr/local/lib/libcrypto.so.3 /usr/lib64/libcrypto.so.3 
fi

openssl version
#OpenSSL 3.1.4 24 Oct 2023 (Library: OpenSSL 3.1.4 24 Oct 2023)
cd ~/phoronix-test-suite-master
sudo ./phoronix-test-suite benchmark openssl

sudo tar -zcvf testresults.tar.gz /var/lib/phoronix-test-suite/test-results/*

cd ~
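# Fetch the JVM performance benchmark suite and matching Temurin JDK builds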
curl -LJO https://github.com/ionutbalosin/jvm-performance-benchmarks/archive/refs/heads/main.zip
unzip jvm-performance-benchmarks-main.zip


if [[ "$(arch)" == x86_64 ]] ; then
  #x64 OpenJDK
  curl -LJO https://github.com/adoptium/temurin11-binaries/releases/download/jdk-11.0.21%2B9/OpenJDK11U-jdk_x64_linux_hotspot_11.0.21_9.tar.gz
  tar -xvf OpenJDK11U-jdk_x64_linux_hotspot_11.0.21_9.tar.gz

  curl -LJO https://github.com/adoptium/temurin17-binaries/releases/download/jdk-17.0.9%2B9/OpenJDK17U-jdk_x64_linux_hotspot_17.0.9_9.tar.gz
  tar -xvf OpenJDK17U-jdk_x64_linux_hotspot_17.0.9_9.tar.gz

  #curl -LJO https://github.com/adoptium/temurin21-binaries/releases/download/jdk-21.0.1%2B12/OpenJDK21U-jdk_x64_linux_hotspot_21.0.1_12.tar.gz
  #tar -xvf OpenJDK21U-jdk_x64_linux_hotspot_21.0.1_12.tar.gz
else
  sudo yum install jq -y
  ln -s /usr/bin/jq jvm-performance-benchmarks-main/scripts/jq/jq-linux-aarch64
  #ARM OpenJDK
  #curl -LJO https://github.com/adoptium/temurin21-binaries/releases/download/jdk-21.0.1%2B12/OpenJDK21U-jdk_aarch64_linux_hotspot_21.0.1_12.tar.gz
  curl -LJO https://github.com/adoptium/temurin11-binaries/releases/download/jdk-11.0.21%2B9/OpenJDK11U-jdk_aarch64_linux_hotspot_11.0.21_9.tar.gz
  curl -LJO https://github.com/adoptium/temurin17-binaries/releases/download/jdk-17.0.9%2B9/OpenJDK17U-jdk_aarch64_linux_hotspot_17.0.9_9.tar.gz

  tar -xvf OpenJDK11U-jdk_aarch64_linux_hotspot_11.0.21_9.tar.gz
  tar -xvf OpenJDK17U-jdk_aarch64_linux_hotspot_17.0.9_9.tar.gz
  #tar -xvf OpenJDK21U-jdk_aarch64_linux_hotspot_21.0.1_12.tar.gz
fi

cd jvm-performance-benchmarks-main
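
# Prepare the OS for stable benchmark runs (answers to the prompts are below)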

sudo ./scripts/shell/configure-os-linux.sh
#Do you want to set CPU core(s) isolation with isolcpus? (yes/no) no
#Do you want to set CPU core(s) isolation with cgroups? (yes/no) yes
#Do you want to disable ASLR? (yes/no) yes
#Do you want to disable turbo boost mode? (yes/no) yes
#Do you want to set the CPU governor to performance? (yes/no) yes
#Do you want to disable CPU hyper-threading? (yes/no) yes
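
# Point the benchmark suite at JDK 11 and run it, then switch to JDK 17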

cp settings/config.properties settings/config.properties.bak
export JAVA_HOME="/home/azureuser/jdk-11.0.21+9"
echo "OPENJDK_HOTSPOT_VM_HOME=\"$JAVA_HOME\"" > settings/config.properties
export PATH=$JAVA_HOME/bin:$PATH
./run-benchmarks.sh | tee run-benchmarks_jdk11.out

export JAVA_HOME="/home/azureuser/jdk-17.0.9+9"
echo "OPENJDK_HOTSPOT_VM_HOME=\"$JAVA_HOME\"" > settings/config.properties
export PATH=$JAVA_HOME/bin:$PATH
./run-benchmarks.sh FibonacciBenchmark | tee run-benchmarks_jdk17.out

There are various benchmarks available, especially for the JVM. I split all the tests into two categories:

  • CPU, Memory, OpenSSL
  • JVM

The JVM tests were especially difficult to compare. Some tests are faster on one processor than on another, but the distribution isn’t uniform. So I decided to do a relative comparison: I selected the IceLake numbers as a baseline and present all data as the percentage of improvement relative to that baseline. I split the data into three categories:

  • Tests that show more than 10% degradation
  • Tests that are on par with IceLake (between -10% and +10%)
  • Tests that show more than 10% improvement

Performance Results

As we can see, Altra processors outperform IceLake in most categories; however, memory throughput and RSA encryption are slower. EPYC processors also do very well.

Altra and Epyc performance relative to Ice Lake Link

I’m not sure why Ice Lake shows poor sysbench CPU performance compared to the Altra and Epyc processors. I should try with a different VM instance type.

A typical Java application will perform RSA decryption; however, Altra processors appear to be slower than IceLake processors here. Further research should be done to compare RSA decryption performance directly. Even though RSA verify isn’t the same as RSA decryption, I will use the RSA verify performance numbers as a baseline in this study.

A few simple options, such as connection pools on the backend or HTTP keep-alive on the frontend, can be used to mitigate the performance degradation. In fact, they are best practice for any production Java application anyway.

Epyc processors show much better performance than Altra and IceLake in cryptography and in CPU performance.

JVM (JDK11) performance comparison (Altra/IceLake and Epyc/IceLake)
Link

For Altra processors, the number of degradations is smaller than the number of improvements. In general, it’s hard to draw a conclusion just from the number of cases, so the average percentage is shown below. Epyc processors show substantially better performance than IceLake. The JDK 11 performance results are aligned with the sysbench results.

JVM (JDK11) average percent (Altra/IceLake and Epyc/IceLake)
Link

I also ran a JDK 17 performance comparison. I would have expected a newer JDK to have better support for the ARM architecture and, as a result, to show better results on ARM compared with the same benchmarks on x86_64. The results didn’t show any significant improvement, although a small improvement is there.

JVM (JDK17) performance comparison (Altra/IceLake and Epyc/IceLake)

JVM (JDK17) average percent (Altra/IceLake and Epyc/IceLake)

Link1 Link2

The improvement is most noticeable on Epyc processors.

Cost comparison

VM price difference (Altra/Epyc/Ice Lake)

Summary

This is limited research based on a view of what a typical cloud application looks like. Within this methodology, Altra and IceLake processors show comparable performance in general. In some cases JVM performance is even faster on Altra processors, by about 20% on average. The cost is about 25% lower compared to the IceLake-based VMs. However, some cryptography functions will be slower, and a lot here depends on your specific application. Following best practices will help remove or reduce the impact of the slower operations.

Epyc processors outperform IceLake and Altra processors in general. JVM performance is in most cases faster than on IceLake and Altra. On average, the JVM is about 40% faster compared to IceLake. The Epyc-based VMs are also about 10% cheaper than the IceLake-based ones.

With the current focus on cost reduction in the cloud industry, Altra is the best choice, but it might require some effort to tune your application for ARM. On the other hand, if a drop-in replacement is required, Epyc processors are the next best thing.

Future improvements

There are several things that I would do differently next time:

  • I would use Standard_D4d_v5 instead of Standard_D4d_v4. It might have shown different results.
  • I would spin up a sample Spring Boot application and measure its throughput. It could produce numbers that are much easier to interpret than synthetic JVM tests.
  • I would compare JDK 21 and JDK 17. JDK 21 was released later and might have more optimizations for the ARM architecture.
  • I would be interested in collecting more results, or at least linking to them here for reference.

P.S.

The cost of the exercise was $68.47. Most of the time was consumed by running the JDK 11 & 17 performance tests; each JVM test run takes approximately 48 hours.
