My GitHub Actions deploy workflow was 87 lines of YAML.
It had grown over 18 months from a clean 20-line file into something I was genuinely afraid to touch. It broke whenever a dependency updated. It had three hardcoded ARNs from an AWS account I was no longer using. It had a comment that said `# TODO: fix this` that had been there for 11 months.
Last month I deleted all 87 lines and replaced them with one command.
Here's exactly how I did it — and what I learned along the way.
## The YAML graveyard
This was my deploy workflow. See if any of this feels familiar:
```yaml
name: Deploy to Production

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Set up Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run tests
        run: npm test

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build Docker image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/my-app:$IMAGE_TAG .
          docker push $ECR_REGISTRY/my-app:$IMAGE_TAG
          echo "IMAGE=$ECR_REGISTRY/my-app:$IMAGE_TAG" >> $GITHUB_ENV

      - name: Download task definition
        run: |
          aws ecs describe-task-definition --task-definition my-app \
            --query taskDefinition > task-definition.json

      - name: Update ECS task definition
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: my-app
          image: ${{ env.IMAGE }}

      - name: Deploy to ECS
        uses: aws-actions/amazon-ecs-deploy-task-definition@v1
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: my-app-service
          cluster: my-app-cluster
          wait-for-service-stability: true

      - name: Notify on failure
        if: failure()
        run: |
          curl -X POST ${{ secrets.SLACK_WEBHOOK_URL }} \
            -H 'Content-type: application/json' \
            --data '{"text":"Deploy failed! Check Actions."}'
```
I wrote this. I'm not proud of it.
The real problem wasn't the YAML itself. The problem was everything hidden underneath the YAML:
- An ECR repository I had to provision manually
- An ECS cluster, service, and task definition I had to set up in the console
- IAM roles with the exact right permissions (I guessed wrong twice)
- A `Dockerfile` I maintained separately
- AWS credentials rotated manually every 90 days
The pipeline was the visible part. The invisible part was 3 days of setup I did 18 months ago that I could no longer remember well enough to recreate.
When a new teammate joined and asked "how does deploy work?" — I sent them the workflow file and said "it's complicated."
That's not an answer. That's a warning sign.
## The breaking point
In February, I switched from Node 18 to Node 20. The Docker build broke because my base image was pinned to `node:18-alpine` in three different places — the `Dockerfile`, the Actions workflow, and a `.nvmrc` file I had forgotten existed.
The fix took 45 minutes. The error message was not helpful. I fixed it by diffing my Dockerfile against a Stack Overflow answer from 2023.
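The drift itself is preventable. A sketch of what I should have done back then, not what I actually ran: make `.nvmrc` the single source of truth and have the workflow read it — `actions/setup-node` supports this via its `node-version-file` input.

```yaml
# Sketch: pin the Node version in exactly one place (.nvmrc)
# and have the workflow step read it from there.
- name: Set up Node.js
  uses: actions/setup-node@v4
  with:
    node-version-file: '.nvmrc'  # single source of truth
    cache: 'npm'
```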
Two weeks later, AWS deprecated the amazon-ecs-render-task-definition@v1 action. The pipeline broke silently — it ran, reported success, but the new image never actually deployed. I found out because a user filed a bug for something I had definitely already fixed.
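In hindsight, that silent failure was detectable. A sketch, assuming the cluster and service names from the workflow above: after a deploy, ask ECS which image the service is actually running and compare it against the tag you just pushed. `verify_image` and the sample registry URL are hypothetical illustrations, not part of any tool.

```sh
#!/bin/sh
# Sketch: catch "pipeline reported success but the old image is still running".
# verify_image is a hypothetical helper: succeed iff the deployed image
# reference ends with the tag we expected to ship.
verify_image() {
  deployed="$1"; expected_tag="$2"
  case "$deployed" in
    *:"$expected_tag") return 0 ;;
    *) return 1 ;;
  esac
}

# In CI you'd fetch the deployed image with the AWS CLI, e.g.:
#   td=$(aws ecs describe-services --cluster my-app-cluster \
#         --services my-app-service --query 'services[0].taskDefinition' --output text)
#   deployed=$(aws ecs describe-task-definition --task-definition "$td" \
#         --query 'taskDefinition.containerDefinitions[0].image' --output text)
deployed="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:abc123"  # sample value
if verify_image "$deployed" "abc123"; then
  echo "deploy verified: $deployed"
else
  echo "MISMATCH: service is running $deployed" >&2
fi
```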
That was the moment I decided: the pipeline is not worth maintaining.
## What I tried first
I looked at Render and Railway. Both are good products. Neither deploys into my own AWS account — they provision their own infrastructure. My company has a compliance requirement that customer data stays in a customer-owned AWS environment. So those were out.
I looked at AWS CodePipeline. I wanted to solve complexity, not add more of it.
Then a colleague mentioned NEXUS AI. He described it as "a CLI that handles all the ECS stuff so you don't have to." I was skeptical. That's what everyone says.
## The initial deploy
I installed the CLI:
```bash
npm install -g @nexusai/cli
nexus login
```
Then I pointed it at my repo:
```bash
nexus deploy source \
  --repo https://github.com/myorg/my-app \
  --name my-app \
  --provider aws_ecs_fargate
```
I expected this to fail immediately. My expectations for new DevOps tools are calibrated by years of experience.
It didn't fail. Four and a half minutes later I got back a URL. The app was running. The same app. In my AWS account.
I checked the AWS console out of habit. There was an ECS cluster. A task definition. A service. An ECR repository with the image in it. NEXUS AI had provisioned all of it.
I had not written a Dockerfile. I had not configured any IAM roles. I had not touched the AWS console.
I sat with that for a moment.
## What actually happens under the hood
Here's what `nexus deploy source` does, in order:

1. Reads your repo — detects the runtime from `package.json`, `requirements.txt`, `go.mod`, etc.
2. Builds the container on NEXUS AI's build infrastructure — not your machine, not a GitHub runner
3. Pushes the image to an ECR repository it provisions in your account
4. Creates (or updates) the ECS infrastructure — cluster, task definition, service, load balancer
5. Issues a TLS certificate via ACM and wires it to the load balancer
6. Waits for health checks to pass before returning the live URL
Steps 3–5 are the 3 days of manual work I did 18 months ago. They now run in parallel and take about 3 minutes.
## The new GitHub Actions workflow
Here's my deploy workflow today:
```yaml
name: Deploy to Production

on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npm test

  deploy:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - name: Deploy
        run: nexus deploy redeploy --deployment-id ${{ secrets.NEXUSAI_DEPLOYMENT_ID }}
        env:
          NEXUSAI_TOKEN: ${{ secrets.NEXUSAI_TOKEN }}
```

24 lines, including `name:` fields and blank lines.
The deploy job has one step. It calls `nexus deploy redeploy`, which tells NEXUS AI to rebuild from the latest commit and roll it out with a rolling update. No Docker commands. No AWS credentials. No ECR. No ECS task definition wrangling.
I kept the test job. NEXUS AI doesn't replace your test suite — it replaces everything after tests pass.
## Secrets and environment variables
Before, I had secrets in three places: GitHub Actions secrets (for the pipeline), AWS Secrets Manager (for the app), and a `.env.example` file that was always slightly out of date.
Now:
```bash
nexus secret set \
  DATABASE_URL=postgres://user:pass@host/db \
  STRIPE_SECRET_KEY=sk_live_... \
  NODE_ENV=production
```
These are encrypted at rest and injected as environment variables when the container starts. The pipeline only needs NEXUSAI_TOKEN — one secret instead of seven.
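If you want the container to fail fast when a secret is missing rather than limp along, a small entrypoint guard works regardless of how the secrets are injected. A sketch — the `require` helper is my own, not part of any CLI:

```sh
#!/bin/sh
# Sketch: refuse to start if a required env var was not injected.
# Variable names match the secrets set above; `require` is a hypothetical helper.
require() {
  for name in "$@"; do
    eval "val=\${$name:-}"
    if [ -z "$val" ]; then
      echo "missing required env var: $name" >&2
      return 1
    fi
  done
}

if require DATABASE_URL STRIPE_SECRET_KEY NODE_ENV; then
  echo "env ok"   # e.g. exec node server.js here
else
  echo "refusing to start" >&2
fi
```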
After updating secrets, one command applies them:
```bash
nexus deploy redeploy --deployment-id <your-deployment-id>
```
## Rollback
Old workflow rollback: figure out the previous image SHA, manually update the ECS task definition, trigger a new deployment, hope the old image hasn't been cleaned up by the ECR lifecycle policy.
New rollback:
```bash
nexus deploy rollback --deployment-id <your-deployment-id>
```
Reverts to the previous container image. Health checks run. Done. I've used this twice. Both times took under 90 seconds.
## Redeployment speed
First deploy: ~4.5 minutes (infrastructure provisioning included).
Subsequent deploys: 60–90 seconds. Infrastructure is already provisioned, so it's just build → push → rolling update.
My old pipeline took 10–12 minutes. Most of that was the Docker build running on GitHub's shared runners plus the ECS service stability wait.
## What I lost
Every tool has trade-offs. These are the real ones:
**Less visibility into the build environment.** With a `Dockerfile` I wrote, I knew exactly what was in the image. With source-based deployment, NEXUS AI generates the image. You can inspect it — `nexus deploy logs` gives the full build output — but you're not authoring the Dockerfile. For most apps this is fine. If you have specific system dependencies (custom C extensions, obscure shared libraries), test carefully.

**The first deploy takes time.** Infrastructure provisioning isn't instant. If you need sub-30-second cold deploys for some reason, this isn't that. But once infrastructure exists, redeployments are fast.

**You're adding a dependency.** NEXUS AI is now in your deploy path: if its build service is down, you can't ship. Worth knowing.
## The numbers
| | Old workflow | New workflow |
|---|---|---|
| Lines of YAML | 87 | 24 |
| Pipeline runtime | 10–12 min | 60–90 sec |
| AWS console setup | ~3 days (one-time) | 0 |
| Secrets locations | 3 | 1 |
| Rollback steps | ~6 manual steps | 1 command |
| Last random breakage | November | Hasn't happened |
## How to try it
```bash
# Install
npm install -g @nexusai/cli

# Authenticate
nexus login

# First deploy — detects Node/Python/Go automatically, no Dockerfile needed
nexus deploy source \
  --repo https://github.com/your/repo \
  --name my-app \
  --provider aws_ecs_fargate  # or gcp_cloud_run, azure_container_apps

# Check status
nexus deploy status --deployment-id <id>

# Set environment variables
nexus secret set KEY=value KEY2=value2

# Redeploy (use this in CI)
nexus deploy redeploy --deployment-id <id>

# Rollback
nexus deploy rollback --deployment-id <id>
```
For CI/CD: add `NEXUSAI_TOKEN` and `NEXUSAI_DEPLOYMENT_ID` as secrets in your GitHub repo settings, then replace your deploy steps with the one-liner from the workflow above.
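If you use the GitHub CLI, both repo secrets can be set from a terminal — `gh secret set` is a real command; the values below are placeholders for your own token and deployment ID:

```bash
# Sketch: register the two secrets the slim workflow needs.
gh secret set NEXUSAI_TOKEN --body "<your-token>"
gh secret set NEXUSAI_DEPLOYMENT_ID --body "<your-deployment-id>"
```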
## Final thought
The 87-line YAML file wasn't the real cost. The real cost was the cognitive overhead of owning it — the 45-minute debugging session when Node versions drifted, the silent failure when an Action was deprecated, the "it's complicated" I sent to a new teammate.
I don't miss any of that.
If you're maintaining a pipeline like the one I had, it's worth spending 20 minutes to find out how much of it you can delete.
Building something and want to compare notes? Drop it in the comments.