Tia Zanella

Posted on Jun 19 • Edited on Jul 7

When a Cisco collector written in Go chooses simplicity over Python or Ansible

#go #cli #cisco #automation

We constantly hear that Ansible and Python are apparently the only ways to automate networks. Today I even overheard someone in a conversation say "Python is the industry standard". Probably I missed the RFC document, or probably the guy was referring to a sales standard. But back to us: what happens when the framework, the platform or the software we are using becomes heavier than the problem to solve?

There is a moment when automation becomes necessary, not because we want to look modern, not because every task deserves a framework and not simply because adding automation automatically means we are doing things better. It becomes necessary because repeating the same command collection manually across many devices is slow, risky, boring and almost impossible to diff and validate properly, especially under pressure.

For this reason I built the Cisco Go Collector during a real migration activity with a very practical goal: collect configuration and command outputs from Cisco devices in an easily repeatable way, without forcing every colleague involved in the process to become a developer or to install an automation stack just to run a super simple flow.

The idea was simple:

define the devices in a CSV, which is the comfort zone for everyone
define the commands in the same CSV file, super simple and organized, with one row per device and command
run a portable Go binary against that CSV file
collect the outputs in organized text files
archive the result as operational evidence that can be easily diffed

That is it!

super lightweight to run
no Python virtual environment
no Ansible playbook structure
no inventory hierarchy
no framework onboarding
no additional runtime or software on corporate managed workstations
just a CSV file and a compiled binary

The automation and AI trap when the solution is heavier than the problem to solve

I love automation and I fully support AI if used the proper way, but we have to find a balance and recognize when to choose one tool over another, and especially one programming language over another.

In this specific case, I wanted something very fast, lightweight, portable and executable freely in a managed environment.
The immediate thought for many would be: let's write a Python script. It would work, but then you introduce dependency overhead and security risks. You have to ask users to install the environment, manage virtual environments and deal with pip. We all know the story: it is nearly impossible to obtain in managed environments and, honestly, overkill for running a simple collector.

Using Ansible for a simple collection task is like using a bazooka to kill a mosquito. Frameworks like Netmiko, Nornir, pyATS and similar ecosystems are excellent when the problem requires state management, reusable workflows, validation logic, abstraction layers, structured inventory, compliance, CI/CD integration or large-scale governance.

But I think not every operational task requires that level of machinery and complexity overhead. Sometimes the real problem is much smaller: I simply need to connect to many devices, run a known list of commands, save the outputs and make the result easy to review with my colleagues. As simple as that!
In that scenario, the complexity of the automation environment can become bigger than the task itself, and in my opinion team balance matters more than the technology used to achieve the result.

Developing Python scripts or Ansible YAML requires a specific level of knowledge, whereas using a CSV file is straightforward for everyone. This is especially true in managed corporate environments, where engineers often work on locked-down laptops, jump servers or controlled workstations, and installing Python packages or external dependencies may require approvals. Even when installation is possible, supportability becomes part of the problem:

who owns the runtime?
who updates the dependencies?
who validates the Python version?
who explains the playbook structure to non-developers?
who supports the colleague who only needs to run some migration checks at 02:00 during a change window?

Those questions matter.
Automation is not only about what is technically possible. It is also about what is operationally usable and makes life easier!

So why not just use Ansible?

Ansible is a great tool. For many network automation use cases, it is the holy grail.

It gives you inventories, variables, modules, playbooks, idempotency, roles, collections and a broad ecosystem. If the goal is configuration management, desired state, multi-step orchestration or repeatable infrastructure workflows, Ansible is a serious candidate.

But for this project, I was not trying to model the network state.

I was not trying to push configuration
I was not trying to build a source-of-truth-driven workflow
I was not trying to create a full orchestration layer
I was not trying to introduce a new automation culture during a migration window

The goal was more direct: let network engineers declare what they want to collect in a CSV, then execute it consistently.
That may sound too simple to be true, but simplicity is exactly the strength here.

a CSV file can be opened by everyone
it can be reviewed before execution
it can be attached to a change request
it can be versioned in Git
it can be edited by people who do not write code
it can be validated by another engineer before the migration starts and edited super fast if needed

In this use case, the CSV is not a limitation, it became the operational contract.

The CSV as the human interface

The design starts from a very basic assumption: the input should be understandable by the whole team.
A typical CSV file looks like this:

datacenter	room	rack	hostname	ip	platform	category	command	target_ip	vrf
DC1	ROOM-A	12	MPX01	10.10.10.100	nx-os	common	show clock
DC1	ROOM-A	12	MPX01	10.10.10.100	nx-os	common	show vlan brief
DC1	ROOM-A	12	MPX01	10.10.10.100	nx-os	common	show pbr static summary
DC1	ROOM-A	12	MPX01	10.10.10.100	nx-os	specialized	show ip route ospf-xxx vrf all
DC1	ROOM-A	12	MPX01	10.10.10.100	nx-os	specialized	show ip ospf neighbor vrf all
DC1	ROOM-B	58	MPX02	10.10.30.100	ios-xe	connectivity	traceroute	10.0.0.1	VRFNAME
DC1	ROOM-B	58	MPX02	10.10.30.100	ios-xe	connectivity	ping	10.0.0.1	VRFNAME
DC1	ROOM-B	60	MPX03	10.10.20.100	nx-os	connectivity	traceroute	1.1.1.1
DC1	ROOM-B	60	MPX03	10.10.20.100	nx-os	connectivity	ping	1.1.1.1

This gives the team a simple way to answer three important questions before execution and act fast on changes:

which devices are involved?
which commands will be executed?
where will the evidence be stored?

That transparency is extremely important during migrations, when pressure is high and ambiguity is expensive, just as expensive as asking someone to develop a custom solution to achieve this simple result. A deterministic CSV file, by contrast, is easier to review than a conversation in a chat, easier to validate than a copy-paste checklist and easier to archive than manual terminal history.

Why Go?

Go was a very practical choice. Its value is not that it's fashionable, but that it allows the tool to be distributed as a single compiled executable and it is blazing fast when handling parallel requests.

For this specific use case, that matters more than language elegance. A compiled binary is easier to distribute in a locked-down environment. It reduces the need for local runtime preparation and avoids asking every operator to install Python packages, manage virtual environments or align dependency versions.
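As a rough illustration of the parallel handling mentioned above, here is a minimal sketch of a bounded worker pool built only on the Go standard library. The names CollectAll, workers and collect are hypothetical and not taken from the project; the real collector may schedule its SSH sessions differently.

```go
package main

import (
	"fmt"
	"sync"
)

// CollectAll runs collect once per host, with at most `workers` hosts
// in flight at the same time, and returns the error (or nil) per host.
// Hypothetical helper: the names are assumptions for this sketch.
func CollectAll(hosts []string, workers int, collect func(host string) error) map[string]error {
	if workers < 1 {
		workers = 1 // avoid blocking forever on an unbuffered semaphore
	}
	results := make(map[string]error, len(hosts))
	var mu sync.Mutex
	var wg sync.WaitGroup
	sem := make(chan struct{}, workers) // counting semaphore

	for _, h := range hosts {
		wg.Add(1)
		sem <- struct{}{} // wait for a free slot
		go func(host string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			err := collect(host)
			mu.Lock()
			results[host] = err
			mu.Unlock()
		}(h)
	}
	wg.Wait()
	return results
}

func main() {
	hosts := []string{"MPX01", "MPX02", "MPX03"}
	// The collect function is a stand-in for "open SSH, run commands, write files".
	results := CollectAll(hosts, 2, func(host string) error {
		fmt.Println("collecting from", host)
		return nil
	})
	fmt.Println("devices processed:", len(results))
}
```

Capping the number of concurrent sessions keeps the tool polite towards jump servers and AAA backends while still finishing much faster than a sequential loop.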

The operational model becomes simple:

prepare CSV -> run binary -> collect output

That is the whole workflow. For a migration team, this is powerful because the tool becomes almost invisible. People do not need to understand the internal code to use it; they only need to understand the CSV and wait for the expected output.

That is the kind of automation I like: not impressive from the outside, but extremely useful when the team is under pressure.

Internal architecture

The project is intentionally small, but not random, and it leaves room to welcome additional scenarios. The internal logic can be seen as four layers.

CSV input -> Device and command model -> SSH execution engine -> Structured output writer

1. CSV loader

The loader reads the CSV header, validates the required columns and creates a list of devices.

Required fields include:

datacenter, room, zone, hostname, ip, platform, category, command

Optional fields include:

target_ip, vrf

Commands are grouped by hostname, so each device can have multiple commands attached to it. This avoids treating every CSV row as an isolated connection task and gives the execution layer a cleaner device-centric model.
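The loader logic described above can be sketched roughly as follows. This is a simplified illustration, not the project's actual code: the type and function names (Device, Command, LoadDevices, LoadDevicesFromString) are assumptions, and the column names are the required and optional fields listed above.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// Command is one CSV row: what to run plus its optional parameters.
type Command struct {
	Category, Text, TargetIP, VRF string
}

// Device groups every command declared for one hostname.
type Device struct {
	Hostname, IP, Platform string
	Commands               []Command
}

// requiredColumns mirrors the required fields listed in the article.
var requiredColumns = []string{
	"datacenter", "room", "zone", "hostname", "ip", "platform", "category", "command",
}

// LoadDevices validates the header and groups rows by hostname,
// keeping devices in the order they first appear in the file.
func LoadDevices(r io.Reader) ([]*Device, error) {
	reader := csv.NewReader(r)
	reader.FieldsPerRecord = -1 // optional trailing columns may be absent
	rows, err := reader.ReadAll()
	if err != nil {
		return nil, fmt.Errorf("read csv: %w", err)
	}
	if len(rows) == 0 {
		return nil, fmt.Errorf("csv is empty")
	}
	col := make(map[string]int, len(rows[0]))
	for i, name := range rows[0] {
		col[strings.ToLower(strings.TrimSpace(name))] = i
	}
	for _, name := range requiredColumns {
		if _, ok := col[name]; !ok {
			return nil, fmt.Errorf("missing required column %q", name)
		}
	}
	// field returns the named cell, or "" when the column or cell is missing.
	field := func(row []string, name string) string {
		i, ok := col[name]
		if !ok || i >= len(row) {
			return ""
		}
		return strings.TrimSpace(row[i])
	}
	byHost := map[string]*Device{}
	var devices []*Device
	for _, row := range rows[1:] {
		host := field(row, "hostname")
		if host == "" {
			continue // skip rows without a hostname in this sketch
		}
		dev, seen := byHost[host]
		if !seen {
			dev = &Device{Hostname: host, IP: field(row, "ip"), Platform: field(row, "platform")}
			byHost[host] = dev
			devices = append(devices, dev)
		}
		dev.Commands = append(dev.Commands, Command{
			Category: field(row, "category"), Text: field(row, "command"),
			TargetIP: field(row, "target_ip"), VRF: field(row, "vrf"),
		})
	}
	return devices, nil
}

// LoadDevicesFromString is a small convenience wrapper for examples and tests.
func LoadDevicesFromString(text string) ([]*Device, error) {
	return LoadDevices(strings.NewReader(text))
}

func main() {
	const sample = `datacenter,room,zone,hostname,ip,platform,category,command,target_ip,vrf
DC1,ROOM-A,CORE,MPX01,10.10.10.100,nx-os,common,show clock
DC1,ROOM-A,CORE,MPX01,10.10.10.100,nx-os,common,show vlan brief
DC1,ROOM-B,CORE,MPX02,10.10.30.100,ios-xe,connectivity,ping,10.0.0.1,VRFNAME`
	devices, err := LoadDevicesFromString(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, d := range devices {
		fmt.Printf("%s (%s): %d commands\n", d.Hostname, d.Platform, len(d.Commands))
	}
}
```

A fuller loader would probably also reject rows with empty required cells; the sketch only checks that the required columns exist in the header.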

2. Platform-aware command builder

Most commands are executed exactly as written. If the command is a normal CLI command such as:

show version
show interface status
show ip route
show running-config

the collector sends the command directly to the device. For commands such as ping and traceroute, the tool can generate the correct Cisco syntax using the optional target_ip and vrf fields. This matters because syntax differs between platforms.

For NX-OS, the structure is:

ping 10.10.10.10 vrf management

For IOS-XE, the structure is:

ping vrf management 10.10.10.10

The CSV remains simple, while the tool handles the platform-specific command construction.
That is a small detail, but it is exactly the kind of detail that becomes annoying during operational work. The more devices and VRFs are involved, the more valuable this becomes.
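To make the platform difference concrete, here is a minimal sketch of such a builder. The function name BuildCommand is hypothetical, and the platform strings nx-os and ios-xe are taken from the sample CSV; the actual tool may structure this differently.

```go
package main

import (
	"fmt"
	"strings"
)

// BuildCommand returns the CLI line to send to the device.
// Normal commands pass through unchanged; ping and traceroute are
// expanded with target_ip and vrf using the platform's word order.
func BuildCommand(platform, command, targetIP, vrf string) (string, error) {
	cmd := strings.TrimSpace(command)
	verb := strings.ToLower(cmd)
	if (verb != "ping" && verb != "traceroute") || targetIP == "" {
		return cmd, nil // regular CLI command: send as written
	}
	if vrf == "" {
		return fmt.Sprintf("%s %s", verb, targetIP), nil
	}
	switch strings.ToLower(strings.TrimSpace(platform)) {
	case "nx-os":
		// NX-OS: destination first, VRF after it.
		return fmt.Sprintf("%s %s vrf %s", verb, targetIP, vrf), nil
	case "ios-xe":
		// IOS-XE: VRF first, destination after it.
		return fmt.Sprintf("%s vrf %s %s", verb, vrf, targetIP), nil
	default:
		return "", fmt.Errorf("unsupported platform %q for %s with vrf", platform, verb)
	}
}

func main() {
	for _, p := range []string{"nx-os", "ios-xe"} {
		line, err := BuildCommand(p, "ping", "10.10.10.10", "management")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-7s %s\n", p, line)
	}
}
```

Returning an error for an unknown platform is a choice made for this sketch: it seems safer than guessing a syntax and sending it to a production device.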

3. SSH execution engine

The SSH layer opens an interactive shell, requests a pseudo-terminal and disables terminal paging with:

terminal length 0

This is important because command output must be collected completely, without pagination prompts interrupting the stream. The runner then sends each command and reads the output until the device prompt returns.
This approach is simple, but it also requires care. Cisco CLI output is not a clean API response, it is terminal text, and the collector needs to detect when the command has finished, strip the echoed command, remove the trailing prompt and write only the meaningful output.

The prompt detection logic is intentionally conservative. It looks for Cisco-like prompts ending in > or # at the end of the output, avoiding premature matches in the middle of command text.
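A conservative prompt check and the output cleanup described above could look roughly like this. It is only a sketch: the regular expression and the names IsPromptReady and CleanOutput are assumptions for illustration, and real devices may need a more tolerant pattern (custom prompts, banners, ANSI sequences).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// promptRe matches a Cisco-like prompt only when it is the last line of
// the buffer: hostname characters, an optional (mode), then > or #.
var promptRe = regexp.MustCompile(`(?:^|\n)[A-Za-z0-9][A-Za-z0-9._/:-]*(?:\([^)\n]*\))?[>#][ \t]*$`)

// normalize converts CRLF line endings, as sent by the terminal, to LF.
func normalize(s string) string {
	return strings.ReplaceAll(s, "\r\n", "\n")
}

// IsPromptReady reports whether the buffer currently ends with a prompt,
// which is the signal that the command has finished.
func IsPromptReady(buffer string) bool {
	return promptRe.MatchString(normalize(buffer))
}

// CleanOutput strips the echoed command from the first line and the
// trailing prompt from the last line, keeping only the real output.
func CleanOutput(raw, command string) string {
	lines := strings.Split(normalize(raw), "\n")
	if command != "" && len(lines) > 0 && strings.Contains(lines[0], command) {
		lines = lines[1:] // drop the echo
	}
	if n := len(lines); n > 0 && promptRe.MatchString(lines[n-1]) {
		lines = lines[:n-1] // drop the prompt
	}
	return strings.TrimSpace(strings.Join(lines, "\n"))
}

func main() {
	raw := "show clock\r\n12:30:00.000 UTC Fri Jun 19 2026\r\nleaf01# "
	fmt.Println("prompt ready:", IsPromptReady(raw))
	fmt.Println("clean output:", CleanOutput(raw, "show clock"))
	// A '#' inside regular text must not be mistaken for a prompt.
	fmt.Println("mid-text match:", IsPromptReady("value is 50 #"))
}
```

Anchoring the pattern to the end of the buffer and refusing lines that contain spaces before the prompt character is what keeps the check conservative.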

Timeout handling is also important. If a command hangs or takes too long, the session tries to recover by sending an interrupt. If recovery fails, the session is considered unhealthy and the caller should stop sending more commands to that device.
This is not glamorous code, but it is the kind of defensive logic that makes a small operational tool usable.
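The recovery flow above can be sketched against a small interface, so the logic can be exercised without a real device. Everything here is hypothetical: the Session interface, the error values and the fakeSession are illustration only, and the interrupt byte is an assumption (Ctrl+C is used here; IOS-style CLIs typically use Ctrl+Shift+6 for ping and traceroute).

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Session is a hypothetical abstraction over the interactive SSH shell.
type Session interface {
	Send(line string) error
	// ReadUntilPrompt returns ErrTimeout if no prompt arrives in time.
	ReadUntilPrompt(timeout time.Duration) (string, error)
}

var (
	ErrTimeout   = errors.New("timed out waiting for prompt")
	ErrUnhealthy = errors.New("session unhealthy")
)

// interrupt is the byte sent to abort a hung command (assumption: Ctrl+C).
const interrupt = "\x03"

// RunCommand sends one command. On timeout it tries to recover the prompt;
// if that fails, the returned error wraps ErrUnhealthy.
func RunCommand(s Session, cmd string, timeout time.Duration) (string, error) {
	if err := s.Send(cmd); err != nil {
		return "", fmt.Errorf("%w: %v", ErrUnhealthy, err)
	}
	out, err := s.ReadUntilPrompt(timeout)
	if err == nil {
		return out, nil
	}
	if !errors.Is(err, ErrTimeout) {
		return "", fmt.Errorf("%w: %v", ErrUnhealthy, err)
	}
	if err := s.Send(interrupt); err != nil {
		return "", fmt.Errorf("%w: %v", ErrUnhealthy, err)
	}
	if _, err := s.ReadUntilPrompt(timeout); err != nil {
		return "", fmt.Errorf("%w: recovery failed: %v", ErrUnhealthy, err)
	}
	return "", ErrTimeout // command lost, but the session is usable again
}

// RunAll stops at the first unhealthy session and reports which
// commands were attempted before that point.
func RunAll(s Session, cmds []string, timeout time.Duration) ([]string, error) {
	var attempted []string
	for _, c := range cmds {
		if _, err := RunCommand(s, c, timeout); errors.Is(err, ErrUnhealthy) {
			return attempted, err
		}
		attempted = append(attempted, c)
	}
	return attempted, nil
}

// fakeSession simulates a device where one command hangs.
type fakeSession struct {
	hangOn     string
	recoverable bool
	last       string
}

func (f *fakeSession) Send(line string) error { f.last = line; return nil }

func (f *fakeSession) ReadUntilPrompt(time.Duration) (string, error) {
	if f.last == f.hangOn || (f.last == interrupt && !f.recoverable) {
		return "", ErrTimeout
	}
	return "output of " + f.last, nil
}

func main() {
	cmds := []string{"show clock", "show tech-support", "show version"}
	for _, recoverable := range []bool{true, false} {
		s := &fakeSession{hangOn: "show tech-support", recoverable: recoverable}
		attempted, err := RunAll(s, cmds, time.Second)
		fmt.Printf("recoverable=%v attempted=%d err=%v\n", recoverable, len(attempted), err)
	}
}
```

Separating "this command timed out" from "this session is unhealthy" lets the caller keep collecting from a device after a single slow command, while still bailing out when the shell is really stuck.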

4. Structured output writer

The output writer creates a deterministic evidence tree.

Example:

output/
└── 2026-06-19T12-30-00Z/
    └── DC1/
        └── ROOM-A/
            └── CORE/
                └── leaf01/
                    ├── show_version.txt
                    ├── show_interface_status.txt
                    └── specialized/
                        └── ping_vrf_management_10_10_10_10.txt

This is more than cosmetic.
A structured folder tree makes it easier to:

compare pre-check and post-check outputs
share results with colleagues
archive migration data
support troubleshooting and RCA activities
understand which device produced which output

The filenames are sanitized so commands become filesystem-safe text files. Again, a simple detail with big operational value.
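As an illustration of the sanitization and the evidence tree, here is a minimal sketch. The function names SanitizeFilename and OutputPath, the replacement rule and the optional subfolder parameter are assumptions made for this example; the project's actual rules may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

// unsafeChars matches every run of characters that is not a letter or digit.
var unsafeChars = regexp.MustCompile(`[^A-Za-z0-9]+`)

// SanitizeFilename turns a CLI command into a filesystem-safe file name,
// for example "show interface status" becomes "show_interface_status.txt".
func SanitizeFilename(command string) string {
	name := unsafeChars.ReplaceAllString(strings.TrimSpace(command), "_")
	name = strings.Trim(name, "_")
	if name == "" {
		name = "command" // fallback for empty or symbol-only input
	}
	return name + ".txt"
}

// OutputPath builds the location of one output file inside the evidence
// tree. subdir is optional (for example a category folder); filepath.Join
// skips it when empty. Directory names are used as given in this sketch.
func OutputPath(root, runStamp, datacenter, room, zone, hostname, subdir, command string) string {
	return filepath.Join(root, runStamp, datacenter, room, zone, hostname, subdir, SanitizeFilename(command))
}

func main() {
	stamp := "2026-06-19T12-30-00Z"
	fmt.Println(OutputPath("output", stamp, "DC1", "ROOM-A", "CORE", "leaf01", "", "show version"))
	fmt.Println(OutputPath("output", stamp, "DC1", "ROOM-A", "CORE", "leaf01", "specialized", "ping vrf management 10.10.10.10"))
	fmt.Println(SanitizeFilename("show ip route | include 10.0"))
}
```

Because the path depends only on the CSV fields and the run timestamp, two runs produce trees with the same shape, which is what makes a pre-check and post-check comparison with a plain diff tool practical.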

Where AI fits and where it should not

This project also raises an interesting question: in the AI era, why write a deterministic command collector at all? Can't we just ask AI to connect directly to the devices?

The answer is NO. AI and automation solve different problems.

AI is excellent for:

helping design the tool
reviewing code
explaining trade-offs
generating documentation
creating test cases
suggesting edge cases
helping non-developers understand the workflow
converting operational notes into structured CSV drafts
summarizing collected outputs after execution

But I would be very careful about using AI as the direct execution layer for network changes or production command collection without strict boundaries.

During a migration, I do not want a probabilistic system deciding what to execute on a device, potentially drifting or reprocessing the data with the risk of altering the output.

I want deterministic input
I want human review
I want predictable execution
I want repeatable output that happens blazing fast at zero cost
I want an artifact that can be stored and audited later

This is where a simple program plays a strategic role: the CSV file defines intent in a deterministic way, the Go binary executes exactly what was declared and the output folder stores the evidence.
AI can help around the workflow, or later to quickly answer questions and clarify doubts, but it should not blur the execution contract.

In other words, AI is useful as an assistant, but the automation engine must remain deterministic. This distinction is very important for me.
AI can help us move faster, but it should not make operational execution less explainable.

Simplicity is not the opposite of engineering

There is a common mistake in technical environments: assuming that simple tools are less serious. I believe the opposite is true: a simple tool can be the result of very deliberate engineering decisions.

In this case, the decisions were:

use CSV because the team can review it
use Go because the result can be distributed as a binary
use SSH because the devices already expose CLI access
write text files because they are easy to archive and compare
keep the scope narrow because the goal is operational reliability
avoid framework dependency because the environment is managed
avoid AI-driven execution because the task requires determinism

This approach is not anti-framework, anti-AI, or anti-Python. It is simply about choosing the right level of abstraction for the specific problem at hand.

When I would still choose Ansible, Python or a framework

I would not use Cisco Go Collector for everything. I would still choose Ansible, Python, Nornir, pyATS, or a comprehensive framework when I need:

complex workflows
configuration deployment
idempotency
reusable automation roles
source-of-truth integration
structured parsing
compliance checks
CI/CD pipelines
test-driven network validation
multi-vendor abstraction
API-first automation
long-term automation platform governance

Those are valid, essential use cases. But for quick, repeatable, human-reviewable command collection during a migration window, a small deterministic collector is a much better operational fit.

A simple decision model

The conversation should not be "framework vs. no framework." The real question is: what is the smallest reliable automation model that solves the operational problem without creating unnecessary friction?

Here is a practical decision matrix:

Scenario	Better fit
One-time or repeated command collection	Small collector
Migration pre/post checks	Small collector
Evidence capture for RCA	Small collector
Non-developer operators involved	Small collector
Locked-down workstation environment	Small collector
Desired-state configuration management	Ansible / framework
Complex orchestration	Ansible / Nornir / Python
Structured validation and parsing	pyATS / Python
Platform governance and reusable roles	Automation platform
Natural language explanation and documentation	AI assistant
Production execution decisions	Deterministic automation

This is not a competition between tools, it is a matter of operational fit.

The strategic role of boring automation

The more I work in complex infrastructure environments, the more I appreciate boring automation.

Boring automation is predictable, it is reviewable, it is easy to explain, it survives change windows and it does not require everyone on the team to learn a new ecosystem. Sometimes, boring automation is exactly what allows a team to move faster.

Cisco Go Collector is intentionally small. It does not try to become a super big project, it simply solves a real operational problem: run declared Cisco CLI commands in an easy way and collect the outputs in an organized way.

That simplicity is not a weakness, it is the core feature.

I would love to hear about your experience

I believe the future of infrastructure automation will not only be about bigger platforms, AI agents, or higher abstraction. Those things will matter, but there will also be an increasing need for small, deterministic, portable tools that solve specific operational problems with zero friction.

Especially in large corporate environments, the best technical solution is not always the most sophisticated one. The best automation is the one your colleagues can actually run, understand, review, and trust.

That is where simplicity becomes strategic.
What is your favorite piece of "boring automation"? Let me know in the comments!

Ciao from Italy
Tia

DEV Community