Maven build in docker taking too much time?

#cicdpipeline #java #spring #maven

Recently, we containerized a Spring Boot application, and the Maven build was part of that containerization. The image build was integrated with GitLab CI/CD, with a dedicated host as the runner. After we deployed it, the dev team complained a lot that the Maven build in the CI/CD pipeline was taking too much time. So how did we cut the build time by 92%?

The initial Dockerfile looked like this:

FROM maven:3-amazoncorretto-8 AS builder
WORKDIR /app
COPY . .
RUN mvn clean package -DskipTests

FROM amazoncorretto:8-alpine3.16-jdk
WORKDIR /app
COPY --from=builder /app/target/*.jar app.jar
CMD ["java", "-jar", "app.jar"]

The first thing that came to mind was, obviously, taking advantage of Docker layers. If you don't know how Docker layers work, here is the doc (https://docs.docker.com/build/guide/layers/). You have to think about layers every time you want to write an efficient Dockerfile. So, at first, we copied only the pom.xml file and ran mvn dependency:go-offline, which resolves all project dependencies, including plugins and reports and their dependencies. Then we copied the rest of the codebase and built the project. What's the advantage? Docker caches each layer, and on a rebuild it reuses the cached layers until it reaches the first instruction whose inputs have changed; that instruction and everything after it are executed again. In most projects the pom.xml file rarely changes, while code changes are deployed all the time, so the dependency layer can usually be reused.

FROM maven:3-amazoncorretto-8 AS builder
WORKDIR /app
COPY ./pom.xml ./
RUN mvn dependency:go-offline
COPY . .
RUN mvn clean package -DskipTests

FROM amazoncorretto:8-alpine3.16-jdk
WORKDIR /app
COPY --from=builder /app/target/*.jar app.jar
CMD ["java", "-jar", "app.jar"]

Here, unless the pom.xml file changes, the COPY layer and the layer that runs mvn dependency:go-offline are retrieved from the cache.

This improved the build time a lot, thanks to Docker layers and the caching mechanism. But we faced three problems.

First, what happens when the pom.xml file changes? Suppose one new dependency is introduced. Docker detects the change, skips the cached layer, and downloads all the dependencies again, even though most of them were downloaded before.
Second, that specific project has some complex build mechanisms and some local jar dependencies, so mvn dependency:go-offline fails and only 80% of the dependencies get cached. The other 20% are downloaded every time the pipeline is triggered.
Third, we have a dedicated host used as the CI/CD runner (build agent). Sometimes we have to run docker system prune to free up space. When we clean up the images, the layer cache goes away and all the dependencies are downloaded again.

The RUN instruction supports a specialized cache, the cache mount, which we can use when we need a more fine-grained cache between runs. With it, we don't need to fetch all of the dependencies from the internet on every build, only the ones that have changed.

To solve these problems, we can use RUN --mount=type=cache. We are able to use this mechanism because we have a dedicated CI/CD host: the cache lives on the build host, so it persists between pipeline runs on that host. Our final Dockerfile looks like this:

FROM maven:3-amazoncorretto-8 AS builder
WORKDIR /app
COPY . .
RUN --mount=type=cache,target=/root/.m2,rw mvn clean package -DskipTests

FROM amazoncorretto:8-alpine3.16-jdk
WORKDIR /app
COPY --from=builder /app/target/*.jar app.jar
CMD ["java", "-jar", "app.jar"]

Now we get the benefit of previously downloaded dependencies in the Docker environment too!
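
One thing to keep in mind on a shared runner is that several pipelines may build at the same time and write to the same Maven repository cache. The variation below is only a sketch of how the cache mount could be tuned for that case, not what we actually deployed: the cache id maven-repo is a made-up name, and sharing=locked is an assumption about what a busy runner might want. Cache mounts also require BuildKit, so older Docker versions may need it enabled explicitly (for example by setting DOCKER_BUILDKIT=1 for the build).

```dockerfile
# syntax=docker/dockerfile:1
FROM maven:3-amazoncorretto-8 AS builder
WORKDIR /app
COPY . .
# id names the cache explicitly ("maven-repo" is a hypothetical name; by
# default the id is the target path). sharing=locked makes a second
# concurrent build wait for the mount instead of writing to the local
# repository at the same time (the default is sharing=shared).
RUN --mount=type=cache,id=maven-repo,target=/root/.m2,sharing=locked \
    mvn clean package -DskipTests

FROM amazoncorretto:8-alpine3.16-jdk
WORKDIR /app
COPY --from=builder /app/target/*.jar app.jar
CMD ["java", "-jar", "app.jar"]
```

The trade-off is that with sharing=locked parallel builds run their Maven step one after another, so whether it is worth it depends on how often pipelines overlap on the runner.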
