
Tia Zanella

When automation meets simplicity over Python or Ansible

We constantly hear that Ansible and Python are the only ways to automate networks. Today I even overheard in a conversation that "Python is the industry standard". Probably I missed the RFC document, or probably the guy was referring to a sales standard. But back to us: what happens when the framework, the platform or the software we are using becomes heavier than the problem to solve?

There is a moment where automation becomes necessary. Not because we want to look modern, not because every task deserves a framework and not simply because adding automation automatically means we are doing things better. It becomes necessary because repeating the same collection of commands manually across many devices is slow, risky, boring and almost impossible to diff and validate properly, especially under pressure.

For this reason I built the Cisco Go Collector during a real migration activity with a very practical goal: collect configuration and command outputs from Cisco devices in an easily repeatable way, without forcing every colleague involved in the process to become a developer or to install an automation stack just to run a super simple flow.

The idea was simple:

  1. define the devices in a CSV, which is the comfort zone for everyone
  2. define the commands in the same CSV file, super simple and organized, with one row per device and command
  3. run a portable Go binary against that CSV file
  4. collect the outputs in organized text files
  5. archive the result as operational evidence that can be easily diffed

That is it!

  • super lightweight to run
  • no Python virtual environment
  • no Ansible playbook structure
  • no inventory hierarchy
  • no framework onboarding
  • no additional runtime or software on corporate managed workstations
  • just a CSV file and a compiled binary

The automation and AI trap when the solution is heavier than the problem to solve

I love automation and I fully support AI if used the proper way, but we have to find a balance and recognize when to choose one tool over another, and especially one programming language over another.

In this specific case, I wanted something very fast, lightweight, portable and executable freely in a managed environment.
The immediate thought for many would be: let's write a Python script. It would work, but then you introduce dependency overhead and security risks, and you have to ask users to install the environment, manage virtual environments and deal with pip. We all know the story: this is nearly impossible to obtain in managed environments, and it is overkill just to run a simple collector.

Using Ansible for a simple collection task is like using a bazooka to kill a mosquito. Libraries and frameworks like Netmiko, Nornir, pyATS and similar ecosystems are excellent when the problem requires state management, reusable workflows, validation logic, abstraction layers, structured inventory, compliance, CI/CD integration or large-scale governance.

But I think not every operational task requires that level of machinery and complexity overhead. Sometimes the real problem is much smaller: I simply need to connect to many devices, run a known list of commands, save the outputs and make the result easy to review with my colleagues. As simple as that!
In that scenario, the complexity of the automation environment can become bigger than the task itself, and in my opinion team balance matters more than the technology used to achieve the result.

Developing Python scripts or Ansible YAML requires a certain level of development knowledge, whereas using a CSV file is straightforward for everyone. This is especially true in managed corporate environments, where engineers often work on locked-down laptops, jump servers or controlled workstations, and installing Python packages or external dependencies may require approvals. Even when installation is possible, supportability becomes part of the problem:

  • who owns the runtime?
  • who updates the dependencies?
  • who validates the Python version?
  • who explains the playbook structure to non-developers?
  • who supports the colleague who only needs to run some migration checks at 02:00 during a change window?

Those questions matter.
Automation is not only about what is technically possible; it is also about what is operationally usable and makes life easier!

So why not just use Ansible?

Ansible is a great tool. For many network automation use cases, it is the holy grail.

It gives you inventories, variables, modules, playbooks, idempotency, roles, collections and a broad ecosystem. If the goal is configuration management, desired state, multi-step orchestration or repeatable infrastructure workflows, Ansible is a serious candidate.

But for this project, I was not trying to model the network state.

  • I was not trying to push configuration
  • I was not trying to build a source-of-truth-driven workflow
  • I was not trying to create a full orchestration layer
  • I was not trying to introduce a new automation culture during a migration window

The goal was more direct: let network engineers declare what they want to collect in a CSV, then execute it consistently.
That may sound too simple to be true, but simplicity is exactly the strength here.

  • a CSV file can be opened by everyone
  • it can be reviewed before execution
  • it can be attached to a change request
  • it can be versioned in Git
  • it can be edited by people who do not write code
  • it can be validated by another engineer before the migration starts and edited super fast if needed

In this use case, the CSV is not a limitation; it becomes the operational contract.

The CSV as the human interface

The design starts from a very basic assumption: the input should be understandable by the whole team.
A typical CSV file looks like this:

datacenter,room,rack,hostname,ip,platform,category,command,target_ip,vrf
DC1,ROOM-A,12,MPX01,10.10.10.100,nx-os,common,show clock,,
DC1,ROOM-A,12,MPX01,10.10.10.100,nx-os,common,show vlan brief,,
DC1,ROOM-A,12,MPX01,10.10.10.100,nx-os,common,show pbr static summary,,
DC1,ROOM-A,12,MPX01,10.10.10.100,nx-os,specialized,show ip route ospf-xxx vrf all,,
DC1,ROOM-A,12,MPX01,10.10.10.100,nx-os,specialized,show ip ospf neighbor vrf all,,
DC1,ROOM-B,58,MPX02,10.10.30.100,ios-xe,connectivity,traceroute,10.0.0.1,VRFNAME
DC1,ROOM-B,58,MPX02,10.10.30.100,ios-xe,connectivity,ping,10.0.0.1,VRFNAME
DC1,ROOM-B,60,MPX03,10.10.20.100,nx-os,connectivity,traceroute,1.1.1.1,
DC1,ROOM-B,60,MPX03,10.10.20.100,nx-os,connectivity,ping,1.1.1.1,

This gives the team a simple way to answer three important questions before execution and act fast on changes:

  1. which devices are involved?
  2. which commands will be executed?
  3. where will the evidence be stored?

That transparency is extremely important during migrations: pressure is high and ambiguity is expensive, just like asking someone to develop a full solution to achieve this simple result. A deterministic CSV file, by contrast, is easier to review than a conversation in a chat, easier to validate than a copy-paste checklist and easier to archive than manual terminal history.

Why Go?

Go was a very practical choice. Its value is not that it's fashionable, but that it allows the tool to be distributed as a single compiled executable, and that it is very fast when handling many device sessions in parallel.

For this specific use case, that matters more than language elegance: a compiled binary is easier to distribute in a locked-down environment. It reduces the need for local runtime preparation and avoids asking every operator to install Python packages, manage virtual environments or align dependency versions.
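The parallel side of this can be sketched with goroutines and a counting semaphore. This is only an illustration of the general technique, not the collector's actual code: collectAll, maxParallel and the collect callback are hypothetical names, and the callback stands in for the real SSH work.

```go
package main

import (
	"fmt"
	"sync"
)

// collectAll calls collect once per host, keeping at most maxParallel
// calls in flight. maxParallel must be at least 1.
// collect is a hypothetical stand-in for the real per-device SSH session.
func collectAll(hosts []string, maxParallel int, collect func(host string) string) map[string]string {
	results := make(map[string]string, len(hosts))
	var mu sync.Mutex // protects results
	var wg sync.WaitGroup
	sem := make(chan struct{}, maxParallel) // counting semaphore

	for _, h := range hosts {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxParallel workers are busy
		go func(host string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when done
			out := collect(host)
			mu.Lock()
			results[host] = out
			mu.Unlock()
		}(h)
	}
	wg.Wait()
	return results
}

func main() {
	hosts := []string{"MPX01", "MPX02", "MPX03"}
	res := collectAll(hosts, 2, func(host string) string {
		return "output from " + host
	})
	for _, h := range hosts {
		fmt.Println(h, "->", res[h])
	}
}
```

Capping the number of concurrent sessions keeps the tool polite towards jump servers and AAA systems, which may matter more than raw speed during a change window.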

The operational model becomes simple:

prepare CSV -> run binary -> collect output

That is the whole workflow. For a migration team, this is powerful because the tool becomes almost invisible. People do not need to understand the internal code to use it; they only need to understand the CSV and wait for the expected output.

That is the kind of automation I like: not impressive from the outside, but extremely useful when the team is under pressure.

Internal architecture

The project is intentionally small, but not random, and it is open to additional scenarios. The internal logic can be seen as four layers.

CSV input -> Device and command model -> SSH execution engine -> Structured output writer

1. CSV loader

The loader reads the CSV header, validates the required columns and creates a list of devices.

Required fields include:

datacenter, room, zone, hostname, ip, platform, category, command

Optional fields include:

target_ip, vrf

Commands are grouped by hostname, so each device can have multiple commands attached to it. This avoids treating every CSV row as an isolated connection task and gives the execution layer a cleaner device-centric model.

2. Platform-aware command builder

Most commands are executed exactly as written. If the command is a normal CLI command such as:

show version
show interface status
show ip route
show running-config

the collector sends the command directly to the device. For commands such as ping and traceroute, the tool can generate the correct Cisco syntax using the optional target_ip and vrf fields. This matters because syntax differs between platforms.

For NX-OS, the structure is:

ping 10.10.10.10 vrf management

For IOS-XE, the structure is:

ping vrf management 10.10.10.10

The CSV remains simple, while the tool handles the platform-specific command construction.
That is a small detail, but it is exactly the kind of detail that becomes annoying during operational work. The more devices and VRFs are involved, the more valuable this becomes.
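The two orderings shown above can be captured in one small function. This is a sketch under assumptions: buildCommand is a hypothetical name, the platform strings are the ones used in the sample CSV, and applying the same argument order to traceroute as to ping is an assumption of this example.

```go
package main

import (
	"fmt"
	"strings"
)

// buildCommand returns the CLI line to send to the device.
// Only ping and traceroute with a target_ip are rewritten; everything
// else is passed through exactly as written in the CSV.
func buildCommand(platform, command, targetIP, vrf string) string {
	verb := strings.ToLower(strings.TrimSpace(command))
	if (verb != "ping" && verb != "traceroute") || targetIP == "" {
		return command
	}
	if vrf == "" {
		return verb + " " + targetIP
	}
	switch strings.ToLower(platform) {
	case "nx-os":
		// NX-OS style: the VRF follows the target.
		return fmt.Sprintf("%s %s vrf %s", verb, targetIP, vrf)
	default:
		// IOS-XE style (used as the fallback in this sketch): the VRF comes first.
		return fmt.Sprintf("%s vrf %s %s", verb, vrf, targetIP)
	}
}

func main() {
	fmt.Println(buildCommand("nx-os", "ping", "10.10.10.10", "management"))
	fmt.Println(buildCommand("ios-xe", "ping", "10.10.10.10", "management"))
	fmt.Println(buildCommand("nx-os", "show clock", "", ""))
}
```

Keeping this logic in one place means the CSV author never has to remember which platform wants the VRF first.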

3. SSH execution engine

The SSH layer opens an interactive shell, requests a pseudo-terminal and disables terminal paging with:

terminal length 0

This is important because command output must be collected completely, without pagination prompts interrupting the stream. The runner then sends each command and reads the output until the device prompt returns.
This approach is simple, but it also requires care. Cisco CLI output is not a clean API response; it is terminal text. The collector needs to detect when the command has finished, strip the echoed command, remove the trailing prompt and write only the meaningful output.

The prompt detection logic is intentionally conservative. It looks for Cisco-like prompts ending in > or # at the end of the output, avoiding premature matches in the middle of command text.
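One way to express that kind of conservative check is a regular expression anchored to the very end of the buffer. The pattern and the hasPrompt and cleanOutput helpers below are assumptions made for illustration; the real collector may use different rules.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// promptRe matches a Cisco-like prompt only when it is the last line of
// the buffer: a hostname-like token, an optional "(config...)" mode,
// then > or #, optional trailing blanks, and end of text (\z).
var promptRe = regexp.MustCompile(`(?:^|\n)[\w.\-]+(?:\([^)\r\n]*\))?[>#][ \t]*\z`)

// hasPrompt reports whether the collected output ends with a prompt.
func hasPrompt(buf string) bool {
	return promptRe.MatchString(buf)
}

// cleanOutput removes the trailing prompt and the echoed command,
// leaving only the command's own output.
func cleanOutput(raw, command string) string {
	s := strings.ReplaceAll(raw, "\r\n", "\n")
	if loc := promptRe.FindStringIndex(s); loc != nil {
		s = s[:loc[0]] // cut everything from the prompt line onwards
	}
	// Drop the first line when it is just the echo of what was sent.
	first, rest, _ := strings.Cut(s, "\n")
	if strings.TrimSpace(first) == command {
		s = rest
	}
	return strings.TrimSpace(s)
}

func main() {
	raw := "show clock\r\n12:30:00.000 UTC Fri Jun 19 2026\r\nleaf01# "
	fmt.Println(hasPrompt(raw))
	fmt.Printf("%q\n", cleanOutput(raw, "show clock"))

	// A prompt followed by more text is not at the end, so it does not match.
	fmt.Println(hasPrompt("leaf01# show clock\r\n12:30:00"))
}
```

Anchoring on the end of the text is what keeps a stray # inside an interface description or a banner from ending the read too early.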

Timeout handling is also important: if a command hangs or takes too long, the session tries to recover by sending an interrupt. If recovery fails, the session is considered unhealthy and the caller should stop sending more commands to that device.
This is not glamorous code, but it is the kind of defensive logic that makes a small operational tool usable.

4. Structured output writer

The output writer creates a deterministic evidence tree.

Example:

output/
└── 2026-06-19T12-30-00Z/
    └── DC1/
        └── ROOM-A/
            └── CORE/
                └── leaf01/
                    ├── show_version.txt
                    ├── show_interface_status.txt
                    └── specialized/
                        └── ping_vrf_management_10_10_10_10.txt

This is more than cosmetic.
A structured folder tree makes it easier to:

  • compare pre-check and post-check outputs
  • share results with colleagues
  • archive migration data
  • support troubleshooting and RCA activities
  • understand which device produced which output

The filenames are sanitized so commands become filesystem-safe text files. Again: simple detail, big operational value.

Where AI fits and where it should not

This project also raises an interesting question: in the AI era, why write a deterministic command collector at all? Can't we just ask AI to connect directly to the devices?

The answer is NO: AI and automation solve different problems.

AI is excellent for:

  • helping design the tool
  • reviewing code
  • explaining trade-offs
  • generating documentation
  • creating test cases
  • suggesting edge cases
  • helping non-developers understand the workflow
  • converting operational notes into structured CSV drafts
  • summarizing collected outputs after execution

But I would be very careful about using AI as the direct execution layer for network changes or production command collection without strict boundaries.

During a migration, I do not want a probabilistic system deciding what to execute on a device, where it could drift or reprocess things and risk altering the output data.

  • I want deterministic input
  • I want human review
  • I want predictable execution
  • I want repeatable output that happens blazing fast at zero cost
  • I want an artifact that can be stored and audited later

This is where a simple program plays a strategic role: the CSV file defines intent in a deterministic way, the Go binary executes exactly what was declared and the output folder stores the evidence.
AI can help around the workflow, answer quick questions or clarify doubts later, but it should not blur the execution contract.

In other words, AI is useful as an assistant, but the automation engine must remain deterministic. This distinction is very important to me.
AI can help us move faster, but it should not make operational execution less explainable.

Simplicity is not the opposite of engineering

There is a common mistake in technical environments: assuming that simple tools are less serious. I believe the opposite is true: a simple tool can be the result of very deliberate engineering decisions.

In this case, the decisions were:

  • use CSV because the team can review it
  • use Go because the result can be distributed as a binary
  • use SSH because the devices already expose CLI access
  • write text files because they are easy to archive and compare
  • keep the scope narrow because the goal is operational reliability
  • avoid framework dependency because the environment is managed
  • avoid AI-driven execution because the task requires determinism

This approach is not anti-framework, anti-AI, or anti-Python. It is simply about choosing the right level of abstraction for the specific problem at hand.

When I would still choose Ansible, Python or a framework

I would not use Cisco Go Collector for everything. I would still choose Ansible, Python, Nornir, pyATS, or a comprehensive framework when I need:

  • complex workflows
  • configuration deployment
  • idempotency
  • reusable automation roles
  • source-of-truth integration
  • structured parsing
  • compliance checks
  • CI/CD pipelines
  • test-driven network validation
  • multi-vendor abstraction
  • API-first automation
  • long-term automation platform governance

Those are valid, essential use cases. But for quick, repeatable, human-reviewable command collection during a migration window, a small deterministic collector is a much better operational fit.

A simple decision model

The conversation should not be "framework vs. no framework". The real question is: what is the smallest reliable automation model that solves the operational problem without creating unnecessary friction?

Here is a practical decision matrix:

Scenario | Better fit
One-time or repeated command collection | Small collector
Migration pre/post checks | Small collector
Evidence capture for RCA | Small collector
Non-developer operators involved | Small collector
Locked-down workstation environment | Small collector
Desired-state configuration management | Ansible / framework
Complex orchestration | Ansible / Nornir / Python
Structured validation and parsing | pyATS / Python
Platform governance and reusable roles | Automation platform
Natural language explanation and documentation | AI assistant
Production execution decisions | Deterministic automation

This is not a competition between tools, it is a matter of operational fit.

The strategic role of boring automation

The more I work in complex infrastructure environments, the more I appreciate boring automation.

Boring automation is predictable, it is reviewable, it is easy to explain, it survives change windows and it does not require everyone on the team to learn a new ecosystem. Sometimes, boring automation is exactly what allows a team to move faster.

Cisco Go Collector is intentionally small. It does not try to become a super big project, it simply solves a real operational problem: run declared Cisco CLI commands in an easy way and collect the outputs in an organized way.

That simplicity is not a weakness, it is the core feature.

I would love to hear about your experience

I believe the future of infrastructure automation will not only be about bigger platforms, AI agents, or higher abstraction. Those things will matter, but there will also be an increasing need for small, deterministic, portable tools that solve specific operational problems with zero friction.

Especially in large corporate environments, the best technical solution is not always the most sophisticated one. The best automation is the one your colleagues can actually run, understand, review, and trust.

That is where simplicity becomes strategic.
What is your favorite piece of "boring automation"? Let me know in the comments!

Ciao from Italy
Tia
