GitHub Actions is commonly used for CI/CD pipelines, but deployments can occasionally fail with timeout errors. This article provides guidance on troubleshooting those timeouts in a Kubernetes environment. A failed Helm deployment step typically ends with output like this:
Error: timed out waiting for the condition
Error: Error: The process 'helm3' failed with exit code 1
Error: The process 'helm3' failed with exit code 1
This error typically means the deployment did not reach a ready state before Helm stopped waiting, but the workflow log gives no detail about why. To identify the root cause, you'll need to use kubectl to investigate what happened in the cluster during the deployment.
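Before digging into pods, it can help to confirm how Helm itself sees the release. A minimal sketch, assuming the release is named service-name and was installed into the default namespace (adjust both to match your chart):
# Current state of the release
$ helm status service-name -n default
# Revision history, including failed upgrades
$ helm history service-name -n default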
Below are a few steps you can follow to troubleshoot the failure:
Ensure that you are using the correct Kubernetes context before executing any kubectl commands.
$ kubectl config use-context test
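If you are not sure which context is active, or which contexts are available in your kubeconfig, check before switching:
$ kubectl config current-context
$ kubectl config get-contexts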
1. Inspect Container Status Using kubectl describe
- Get the service container details:
$ kubectl get pods --all-namespaces -o wide | grep "service-name"
default   service-name-488d8   1/1   Running   3   4d7h   10.1.2.2   testnode   <none>   <none>
- Get detailed information about the pod's resources, status, and events:
$ kubectl describe pod service-name-488d8 -n default
Name:             service-name-488d8
Namespace:        default
Priority:         0
Service Account:  service-name
Node:             testnode/10.1.2.2
Start Time:       Mon, 14 Jul 2025 12:41:19 -0500
Labels:           app.kubernetes.io/instance=service-name
                  app.kubernetes.io/name=service-name
Annotations:      prometheus.io/path: /metrics
                  prometheus.io/port: 8000
                  prometheus.io/scrape: true
                  sidecar.istio.io/inject: false
Status:           Running
IP:               10.1.2.2
IPs:
  IP:  10.1.2.2
Controlled By:  ReplicaSet/service-name-488d8
Containers:
  service-name:
    Container ID:   docker://f12345678a12345678b12345678c12345678d12345678e12345678abcdabcdab
    Image:          image-repository.com/service-name:latest
    Image ID:       docker-pullable://image-repository.com/service-name@sha256:12345678123456781234567812345678123456781234567812345678abcded12
    Ports:          8000/TCP
    Host Ports:     8000/TCP
    State:          Running
      Started:      Wed, 16 Jul 2025 17:20:20 -0500
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Tue, 15 Jul 2025 15:14:50 -0500
      Finished:     Wed, 16 Jul 2025 17:20:19 -0500
    Ready:          True
    Restart Count:  2
    Limits:
      memory:  6Gi
    Requests:
      memory:  3Gi
    Liveness:   http-get http://:http/health/liveness/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:http/health/readiness/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      ENV_HOST_NAME:  (v1:spec.nodeName)
    Mounts:
      /service-name/conf from service-name-config (rw)
      /service-name/log from service-name-log (rw)
      /service-name/cores from service-name-cores (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-abcde (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  service-name-config:
    Type:          HostPath (bare host directory volume)
    Path:          /host/service-name/conf
    HostPathType:
  service-name-log:
    Type:          HostPath (bare host directory volume)
    Path:          /host/service-name/log
    HostPathType:
  service-name-cores:
    Type:          HostPath (bare host directory volume)
    Path:          /host/service-name/cores
    HostPathType:
  kube-api-access-abcde:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:       Burstable
Node-Selectors:  role=server
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                       From     Message
  ----     ------     ----                      ----     -------
  Warning  Unhealthy  14m (x189 over 3d2h)      kubelet  Liveness probe failed:
  Warning  Unhealthy  6m38s (x182 over 2d23h)   kubelet  Readiness probe failed:
Use the describe output to narrow down the failure (a few targeted commands for these checks are sketched after this list):
- Inspect Pod Status: Check the pod's last state, including the reason and exit code. The container might have been OOMKilled, crashed, or restarted for another reason.
- Review Events: Look for warnings or failures related to scheduling, health checks, or image pulls.
- Check Health Probes: If liveness or readiness probes are failing, investigate why the health check endpoints are returning unhealthy. This could indicate issues in application startup, configuration, or dependencies.
- Verify Image Pull Status: If the pod is stuck in ImagePullBackOff or ErrImagePull, it might be unable to download the image due to:
  - Incorrect image reference or missing image in the repository
  - Authentication issues
  - Large image size causing timeout or resource constraints
- Verify Environment Configuration: Incorrect or missing environment variables, configuration files, or file paths can also prevent the application from starting or cause runtime crashes.
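Each of these checks maps to a quick command. A minimal sketch, reusing the example pod name service-name-488d8 and the probe port 8000 from the describe output above (the last command assumes the image ships a wget binary; substitute curl if that is what the image contains):
# Last termination reason and exit code (e.g. OOMKilled / 137)
$ kubectl get pod service-name-488d8 -n default -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'
# Recent events for this pod, oldest first
$ kubectl get events -n default --field-selector involvedObject.name=service-name-488d8 --sort-by=.lastTimestamp
# Hit the readiness endpoint from inside the container
$ kubectl exec service-name-488d8 -n default -- wget -qO- http://localhost:8000/health/readiness/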
2. Review the container logs
$ kubectl logs service-name-488d8 --tail=100
- Analyze container logs for error patterns or stack traces.
- If supported, enable additional logging dynamically and monitor for any issues.
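Because the pod in this example has already restarted (Restart Count: 2), the logs of the previous container instance are often more revealing than the current ones; kubectl keeps them available via --previous:
# Logs from the instance that crashed or was OOMKilled
$ kubectl logs service-name-488d8 -n default --previous --tail=100
# Follow the current instance live while reproducing the issue
$ kubectl logs -f service-name-488d8 -n default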
3. Investigate a crash-looping container
$ kubectl get pods --all-namespaces -o wide | grep "service-name"
default   service-name-488d8   1/1   Running            3    4d7h   10.1.2.2   testnode   <none>   <none>
default   service-name-567e9   0/1   CrashLoopBackOff   22   98m    10.1.2.2   testnode   <none>   <none>
One of the pods, service-name-567e9, which was intended to deploy and replace the previous one, is in a CrashLoopBackOff state and needs to be analyzed to determine the root cause.
- Debugging a crash-looping container can be challenging. To investigate further, shell into the node where the service is running and modify the container to run in sleep mode so that it stays up long enough to access it for deeper analysis; you can then run tools such as gdb inside it (a kubectl-level sketch follows the console example below).
testnode / # docker ps | grep "service-name"
123456789abc   eabcd123456e             "/usr/bin/dumb-init …"   17 minutes ago   Up 17 minutes   k8s_service-name_service-name-b1234abc4-488d8_default_f2a8d5f7-f03f-4944-9cbf-1bf43f2d8881_5
18b1b5f76d02   k8s.gcr.io/pause:3.4.1   "/pause"                 5 days ago       Up 5 days       k8s_POD_service-name-b1234abc4-488d8_default_f2a8d5f7-f03f-4944-9cbf-1bf43f2d8881_0
testnode / # docker exec -it 123456789abc bash
testnode:/app [main]$
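The "sleep mode" step can also be done from kubectl instead of the node. A minimal sketch, assuming the workload is a Deployment named service-name, the image provides a coreutils sleep and a shell, and your kubectl supports the debug subcommand (revert the patch once you are done):
# Temporarily give the container a do-nothing foreground process so it stays up
$ kubectl patch deployment service-name -n default --type=json \
    -p='[{"op":"add","path":"/spec/template/spec/containers/0/command","value":["sleep","infinity"]}]'
# Alternatively, clone the crash-looping pod with its command replaced by an interactive shell
$ kubectl debug service-name-567e9 -n default -it --copy-to=service-name-debug --container=service-name -- sh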
- Absence of a Persistent Foreground Process: Docker containers need a foreground process to stay running. If the main application process exits, or if the CMD or ENTRYPOINT in the Dockerfile doesn't keep a process active, the container will stop and be restarted, which causes the crash loop. Make sure your foreground process is actively running.
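To confirm what the container is actually started with, and whether the chart or a patch overrides the image's ENTRYPOINT/CMD, you can inspect the pod spec; this sketch reuses the example pod name from above:
# Empty output means the image's own ENTRYPOINT/CMD is used
$ kubectl get pod service-name-567e9 -n default -o jsonpath='{.spec.containers[0].command} {.spec.containers[0].args}'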