GitHub Actions is commonly used for CI/CD pipelines, but deployments can occasionally fail with timeout errors. This article provides guidance on troubleshooting those timeouts in a Kubernetes environment. A failed Helm deployment step typically ends with output like this:
Error: timed out waiting for the condition
Error: Error: The process 'helm3' failed with exit code 1
Error: The process 'helm3' failed with exit code 1
This error typically means the deployment did not reach a ready state before Helm stopped waiting, but the workflow log gives no detail about why. To identify the root cause, you'll need to use kubectl to investigate what happened in the cluster during the deployment.
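Before digging into pods, it can help to confirm how Helm itself sees the release. A minimal sketch, assuming the release is named service-name and was installed into the default namespace (adjust both to match your chart):
# Current state of the release
$ helm status service-name -n default
# Revision history, including failed upgrades
$ helm history service-name -n default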
Below are a few steps you can follow to troubleshoot the failure:
Ensure that you are using the correct Kubernetes context before executing any kubectl commands.
$ kubectl config use-context test
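If you are not sure which context is active, or which contexts are available in your kubeconfig, check before switching:
$ kubectl config current-context
$ kubectl config get-contexts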
1. Inspect Container Status Using kubectl describe
- Get the service container details:
$ kubectl get pods --all-namespaces -o wide | grep "service-name"
default   service-name-488d8   1/1   Running   3   4d7h   10.1.2.2   testnode   <none>   <none>
- Get detailed information about the pod's resources, status, and events:
$ kubectl describe pod service-name-488d8 -n default
Name:             service-name-488d8
Namespace:        default
Priority:         0
Service Account:  service-name
Node:             testnode/10.1.2.2
Start Time:       Mon, 14 Jul 2025 12:41:19 -0500
Labels:           app.kubernetes.io/instance=service-name
                  app.kubernetes.io/name=service-name
Annotations:      prometheus.io/path: /metrics
                  prometheus.io/port: 8000
                  prometheus.io/scrape: true
                  sidecar.istio.io/inject: false
Status:           Running
IP:               10.1.2.2
IPs:
  IP:  10.1.2.2
Controlled By:  ReplicaSet/service-name-488d8
Containers:
  service-name:
    Container ID:   docker://f12345678a12345678b12345678c12345678d12345678e12345678abcdabcdab
    Image:          image-repository.com/service-name:latest
    Image ID:       docker-pullable://image-repository.com/service-name@sha256:12345678123456781234567812345678123456781234567812345678abcded12
    Ports:          8000/TCP
    Host Ports:     8000/TCP
    State:          Running
      Started:      Wed, 16 Jul 2025 17:20:20 -0500
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Tue, 15 Jul 2025 15:14:50 -0500
      Finished:     Wed, 16 Jul 2025 17:20:19 -0500
    Ready:          True
    Restart Count:  2
    Limits:
      memory:  6Gi
    Requests:
      memory:  3Gi
    Liveness:   http-get http://:http/health/liveness/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:http/health/readiness/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      ENV_HOST_NAME:  (v1:spec.nodeName)
    Mounts:
      /service-name/conf from service-name-config (rw)
      /service-name/log from service-name-log (rw)
      /service-name/cores from service-name-cores (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-abcde (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  service-name-config:
    Type:          HostPath (bare host directory volume)
    Path:          /host/service-name/conf
    HostPathType:
  service-name-log:
    Type:          HostPath (bare host directory volume)
    Path:          /host/service-name/log
    HostPathType:
  service-name-cores:
    Type:          HostPath (bare host directory volume)
    Path:          /host/service-name/cores
    HostPathType:
  kube-api-access-abcde:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:       Burstable
Node-Selectors:  role=server
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                       From     Message
  ----     ------     ----                      ----     -------
  Warning  Unhealthy  14m (x189 over 3d2h)      kubelet  Liveness probe failed:
  Warning  Unhealthy  6m38s (x182 over 2d23h)   kubelet  Readiness probe failed:
Use the describe output to narrow down the failure (a few targeted commands for these checks are sketched after this list):
- Inspect Pod Status: Check the pod's last state, including the reason and exit code. The container might have been OOMKilled, crashed, or restarted for another reason.
- Review Events: Look for warnings or failures related to scheduling, health checks, or image pulls.
- Check Health Probes: If liveness or readiness probes are failing, investigate why the health check endpoints are returning unhealthy. This could indicate issues in application startup, configuration, or dependencies.
- Verify Image Pull Status: If the pod is stuck in ImagePullBackOff or ErrImagePull, it might be unable to download the image due to:
  - Incorrect image reference or missing image in the repository
  - Authentication issues
  - Large image size causing timeout or resource constraints
- Verify Environment Configuration: Incorrect or missing environment variables, configuration files, or file paths can also prevent the application from starting or cause runtime crashes.
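Each of these checks maps to a quick command. A minimal sketch, reusing the example pod name service-name-488d8 and the probe port 8000 from the describe output above (the last command assumes the image ships a wget binary; substitute curl if that is what the image contains):
# Last termination reason and exit code (e.g. OOMKilled / 137)
$ kubectl get pod service-name-488d8 -n default -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'
# Recent events for this pod, oldest first
$ kubectl get events -n default --field-selector involvedObject.name=service-name-488d8 --sort-by=.lastTimestamp
# Hit the readiness endpoint from inside the container
$ kubectl exec service-name-488d8 -n default -- wget -qO- http://localhost:8000/health/readiness/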
2. Review the container logs
$ kubectl logs service-name-488d8 --tail=100
- Analyze container logs for error patterns or stack traces.
- If supported, enable additional logging dynamically and monitor for any issues.
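Because the pod in this example has already restarted (Restart Count: 2), the logs of the previous container instance are often more revealing than the current ones; kubectl keeps them available via --previous:
# Logs from the instance that crashed or was OOMKilled
$ kubectl logs service-name-488d8 -n default --previous --tail=100
# Follow the current instance live while reproducing the issue
$ kubectl logs -f service-name-488d8 -n default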
3. Investigate a crash-looping container
$ kubectl get pods --all-namespaces -o wide | grep "service-name"
default   service-name-488d8   1/1   Running            3    4d7h   10.1.2.2   testnode   <none>   <none>
default   service-name-567e9   0/1   CrashLoopBackOff   22   98m    10.1.2.2   testnode   <none>   <none>
One of the pods, service-name-567e9, which was intended to deploy and replace the previous one, is in a CrashLoopBackOff state and needs to be analyzed to determine the root cause.
- Debugging a crash-looping container can be challenging. To investigate further, shell into the node where the service is running and modify the container to run in sleep mode so that it stays up long enough to access it for deeper analysis; you can then run tools such as gdb inside it (a kubectl-level sketch follows the console example below).
testnode / # docker ps | grep "service-name"
123456789abc   eabcd123456e             "/usr/bin/dumb-init …"   17 minutes ago   Up 17 minutes   k8s_service-name_service-name-b1234abc4-488d8_default_f2a8d5f7-f03f-4944-9cbf-1bf43f2d8881_5
18b1b5f76d02   k8s.gcr.io/pause:3.4.1   "/pause"                 5 days ago       Up 5 days       k8s_POD_service-name-b1234abc4-488d8_default_f2a8d5f7-f03f-4944-9cbf-1bf43f2d8881_0
testnode / # docker exec -it 123456789abc bash
testnode:/app [main]$
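The "sleep mode" step can also be done from kubectl instead of the node. A minimal sketch, assuming the workload is a Deployment named service-name, the image provides a coreutils sleep and a shell, and your kubectl supports the debug subcommand (revert the patch once you are done):
# Temporarily give the container a do-nothing foreground process so it stays up
$ kubectl patch deployment service-name -n default --type=json \
    -p='[{"op":"add","path":"/spec/template/spec/containers/0/command","value":["sleep","infinity"]}]'
# Alternatively, clone the crash-looping pod with its command replaced by an interactive shell
$ kubectl debug service-name-567e9 -n default -it --copy-to=service-name-debug --container=service-name -- sh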
- Absence of a Persistent Foreground Process: Docker containers need a foreground process to stay running. If the main application process exits, or if the CMD or ENTRYPOINT in the Dockerfile doesn't keep a process active, the container will stop and be restarted, which causes the crash loop. Make sure your foreground process is actively running.
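To confirm what the container is actually started with, and whether the chart or a patch overrides the image's ENTRYPOINT/CMD, you can inspect the pod spec; this sketch reuses the example pod name from above:
# Empty output means the image's own ENTRYPOINT/CMD is used
$ kubectl get pod service-name-567e9 -n default -o jsonpath='{.spec.containers[0].command} {.spec.containers[0].args}'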