Vibe coding a Go CLI to extract data from articles

Motivation

On this website (Vibe Classes ⚠️ under construction) there is a section called Articles, intended to be a collection of recent news about Vibe Coding and related subjects. Every time an interesting article pops up in the media, it is curated by our team and saved there for reference. The data is stored in a JSON file. An article has the following attributes:

type Article = {
  title: string
  description: string
  image: string
  source: string
  url: string
  slug: string
  publishDate: Date
}

These attributes are intentionally chosen because they are based on the Open Graph Protocol. You may have already noticed that when you paste a link on some social media, after a second or two, it expands, displaying an image, a title and a summary. This data comes from the Open Graph tags contained in the source code of the page you posted.
To add a new article to this website, we currently need to inspect the page's source code, find the Open Graph tags, and copy and paste their contents to create a new object. And, finally, add it to the array in the JSON file.
The JSON file looks like this:

{
  "articles": [
    {
      "slug": "vibe-code-or-retire",
      "url": "https://www.infoworld.com/article/3960574/vibe-code-or-retire.html",
      "title": "Vibe code or retire",
      "description": "Vibe coding is how we will all write code in the future. Start learning it now, or your career as a software developer will end.",
      "image": "https://www.infoworld.com/wp-content/uploads/2025/04/3960574-0-34945300-1745312518-shutterstock_1724042191.jpg",
      "source": "InfoWorld",
      "publishDate": "2025-04-22"
    }
    // ...
  ]
}

This process is cumbersome and tedious, not to mention error-prone. Let's fix it with some Vibe Coding. We are going to automate this process, learn something new and have fun. From what we know, the Go language is great for CLI tools. I have pretty basic knowledge of the language, so it would take me several hours to build something acceptable by myself. So, without further ado, let's get our hands dirty.

Tooling

For this challenge we will use Visual Studio Code and Anthropic's Claude (Claude 3.7 Sonnet). Go must also be installed. I am running Fedora Linux.

Prompt Engineering

If there is one thing Vibe Coding heavily relies on, it is prompt engineering. The results you get are directly related to the prompts you write: they need to be objective and provide context, clear instructions and a detailed description of what you expect as an answer. We will have a post dedicated to this subject.

Requirements

Let's gather the requirements for our problem and build our first prompt.
We start with a link to an interesting article. Our CLI must fetch the HTML page and parse it, find the specified tags, extract their content, map them to our predefined article attributes, assemble a JSON object and add it to the array of entries in our database file. Easy! 😅
We are going to extract information from the following tags:

"og:url"           --> url and slug
"og:title"         --> title
"og:description"   --> description
"og:image"         --> image

The slug property will need some processing, and one property is missing: publishDate. There is no standard tag for the publishing date, and the article may not have been published on the same day you process it. No problem, we will figure out a way to deal with it along the way.
Our goal is to end with something like this:

add_vibe_article {url} {database file}

Coding

We start by opening Claude in the browser, which greets us with its welcome screen:

[Screenshot: Claude welcome screen]

Let's start. It will be an iterative process.

  1. Stating the problem:
1. I need a CLI software written in Go.
2. It will receive two arguments and must show usage details if not provided.
3. The first argument will be a URL to a public web page.
4. The second argument is the path to a JSON file.
5. The web page must be fetched and parsed, the content of the following tags must be extracted: "og:url", "og:title", "og:description" and "og:image".
6. Create a JSON object with the keys url, title, description and image. Use the information extracted on the last step as values.
7. Print the object to console, properly formatted and indented.
8. Provide detailed and clear instructions on how to test the solution on local development environment.

In my prompts, I'm structuring them as numbered lists for several reasons. First, it helps Claude process discrete requirements sequentially. Second, it makes it easier to reference specific points in follow-up prompts if needed. Finally, numbered lists help maintain clarity and prevent the AI from missing critical requirements as complexity increases.

In a few seconds, it will start streaming the response. It provided the source code and a document with instructions as we asked. The screen looks like this:

[Screenshot: Claude's response]

The source code and instructions from the first iteration can be found on my GitHub, here. It looked promising but, as very often happens, there is a bug along the way:

[Screenshot: Iteration 1 compiler error]

The error message is very clear. We have two options: return the error directly to Claude, or take a peek at the code and try to fix it ourselves. I recommend trying the latter; maybe you will learn something and save some tokens. If it doesn't go well, Claude will be there to help anyway. I am going to take a look.

package main

import (
	"encoding/json"
	"fmt"
	"io/ioutil"
	"net/http"
	"os"
	"strings" // compiler error: "strings" imported and not used
	"golang.org/x/net/html"
)
// ...

Removing the "strings" import on line 9 made it compile like a charm. This is a good feeling; it makes you feel like you're still a software engineer. I followed all the instructions but gave the binary a different name from the one proposed: add_vibe_article. Time to test.
With no arguments provided, it failed and displayed usage instructions, as required.
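That argument check is typically a few lines at the top of main. Here is a minimal sketch of the idea (a hypothetical `usageError` helper, not the exact code Claude generated):

```go
package main

import "fmt"

// usageError returns a usage message when the argument list (excluding
// the program name) doesn't contain exactly the two required positional
// arguments, and "" when it does. In the real CLI, main would call this
// with os.Args[1:] and os.Exit(1) on a non-empty result.
func usageError(prog string, args []string) string {
	if len(args) != 2 {
		return fmt.Sprintf("Usage: %s <article-url> <database-file.json>", prog)
	}
	return ""
}

func main() {
	// Simulate being called with no arguments.
	fmt.Println(usageError("add_vibe_article", nil))
}
```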


I will make a copy of the original database file in the current working directory (just to pass the argument validation for now) and find an interesting article for us to play with.

[Screenshot: Result 1]

Absolutely amazing! We got all four expected attributes; it assembled an object and printed it out, exactly as required.
It is good practice to perform a few iterations (prompt chaining) instead of trying to achieve your goal with a single prompt (zero-shot). This theory will be covered in the post about Prompt Engineering.

It's worth noting that at this stage we're focusing on the happy-path scenario: our tool assumes the article has proper Open Graph tags and that the URL is accessible. In a production environment, we would need to implement robust error handling for cases like missing tags, network failures, or malformed URLs. For the purposes of this tutorial, we're keeping things simple to focus on the core functionality.
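To make the core extraction step concrete, here is a dependency-free sketch of pulling Open Graph properties out of raw HTML. The generated code uses a proper parser (golang.org/x/net/html); this regexp version is just an illustration and assumes the common `property="og:..." content="..."` attribute order:

```go
package main

import (
	"fmt"
	"regexp"
)

// extractOGTags scans HTML for <meta property="og:..." content="..."> tags
// and returns them as a map. Sketch only: a real implementation should use
// an HTML parser rather than a regular expression.
func extractOGTags(body string) map[string]string {
	re := regexp.MustCompile(`<meta[^>]+property="(og:[^"]+)"[^>]+content="([^"]*)"`)
	tags := map[string]string{}
	for _, m := range re.FindAllStringSubmatch(body, -1) {
		tags[m[1]] = m[2]
	}
	return tags
}

func main() {
	page := `<head>
	  <meta property="og:title" content="Vibe code or retire" />
	  <meta property="og:image" content="https://example.com/cover.jpg" />
	</head>`
	for k, v := range extractOGTags(page) {
		fmt.Printf("%s = %s\n", k, v)
	}
}
```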

  2. Second iteration

We have a few problems left to solve. Among them are two missing attributes: slug and publishDate. Let's work on them, starting with slug.
In the context of SEO (Search Engine Optimization), a slug is the part of a URL that identifies a specific page on a website in a readable format. It typically appears at the end of the URL and uses words that describe the page's content, rather than numbers or random characters.
For example, in the URL "https://example.com/blog/what-is-slug", the slug is "what-is-slug". In fact, for displaying the articles as a list on the website, I wouldn't need the slug. But the framework I am using accepts two attributes as index: id or slug. I'll choose slug just because I think it will be more fun. It would be pretty easy to extract if an LLM processed the URL every time, but here it has to define a rule for a piece of software to follow.
Let's first recall what a URL is made of. A URL (Uniform Resource Locator) can be broken down into several key parts: scheme, subdomain, domain name, path, and potentially a query string and fragment. We are interested in the path. Let's ask.

1. Adapt the script to extract the slug of the provided URL.
2. Split the URL by slashes and take the last section.
3. Remove query strings and fragments.
4. Create a property called "slug" and add it to the existing set of properties.
5. Provide detailed and clear instructions on how to apply and test the changes.
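The rules in this prompt map almost directly onto Go's standard library: net/url already separates the path from the query string and fragment, so no manual splitting is needed. A sketch of the idea (not the exact code Claude generated; the extension stripping is my own addition, since our slugs don't carry ".html"):

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// extractSlug takes the last path segment of the URL, with the query
// string and fragment already excluded by url.Parse, and strips any
// file extension.
func extractSlug(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	base := path.Base(u.Path) // last segment; u.Path has no query/fragment
	return strings.TrimSuffix(base, path.Ext(base)), nil
}

func main() {
	slug, _ := extractSlug("https://www.infoworld.com/article/3960574/vibe-code-or-retire.html")
	fmt.Println(slug) // vibe-code-or-retire
}
```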

This time it worked smoothly. The generated code and document can be found here. Something curious happened: we said nothing about printing the results to the console or about the expected output, yet it added a feature of writing the output to a JSON file. The instructions ask us to create a folder called output, where the file will be created. Fantastic!


  3. Third iteration

The publishDate! I don't have a clue how to get the date a web page was published. So, what do we do? We ask for help.

Given a URL of an article, is there a standardized way of extracting the date it was published? If not, any workarounds?

As I imagined, there is no standard, but there are always workarounds. Claude proposed some: Open Graph has a meta tag for it (not guaranteed to be present); sometimes the date is in the URL structure, like example.com/2023/05/12/article-title; and, as a last resort, we can parse the HTML content. There was no need to ask for the implementation; it provided the code and documentation implementing the idea. The output can be found here. Easy as that.
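The first two workarounds can be sketched in a few lines. This assumes the Open Graph article:published_time meta tag and a YYYY/MM/DD pattern in the URL; the generated code additionally falls back to scanning the HTML body, which is omitted here:

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Open Graph's article:published_time tag, when present.
	metaDate = regexp.MustCompile(`property="article:published_time"[^>]+content="(\d{4}-\d{2}-\d{2})`)
	// Dates embedded in the URL, e.g. example.com/2023/05/12/article-title.
	urlDate = regexp.MustCompile(`/(\d{4})/(\d{2})/(\d{2})/`)
)

// findPublishDate tries the workarounds in order and returns "" when
// none applies. Sketch only, not Claude's exact code.
func findPublishDate(pageURL, body string) string {
	if m := metaDate.FindStringSubmatch(body); m != nil {
		return m[1]
	}
	if m := urlDate.FindStringSubmatch(pageURL); m != nil {
		return fmt.Sprintf("%s-%s-%s", m[1], m[2], m[3])
	}
	return ""
}

func main() {
	fmt.Println(findPublishDate("https://example.com/2023/05/12/article-title", "")) // 2023-05-12
}
```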

[Screenshot: iteration output]

  4. Fourth iteration

We have all attributes we need. It's time to add the entry to the database and celebrate.

Having the object created before with the six attributes: url, slug, title, description, image and publishDate. Implement the following:
1. Use the path provided to the output as an existing target JSON file.
2. Create a backup copy of it before manipulation. The file must be named .json.YYYYMMDD.bkp, replace year, month and date accordingly and save on the same folder.
3. The target file follows this structure {"articles":[{...},...]}.
4. The new entry must be appended to the "articles" array.
5. The generated JSON file must be valid.
6. Provide the complete source code.
7. Provide complete and detailed documentation.

It worked nicely, as expected. The outputs can be found here. This time it provided a shell script for testing. Oftentimes you don't just get what you asked for, you get more! All our requirements have now been covered. We are ready for the final test. I have compiled the latest version. Let's run it and add an article for real.

[Screenshot: running the final build]

And it looked nice on the website!

[Screenshot: Result 3]

Conclusion

In this article, we've demonstrated how Vibe Coding can significantly accelerate development tasks by leveraging AI assistants like Claude. What would have taken hours of manual coding, debugging, and testing was accomplished in just a few iterations of prompt engineering and code refinement.

The final result is a practical Go CLI tool that:

  • Extracts OpenGraph metadata from article URLs
  • Creates properly formatted JSON entries with all required fields
  • Safely updates our articles database with appropriate backups
  • Saves significant time compared to manual data entry

This approach showcases several advantages of Vibe Coding:

  1. Rapid Development: We created a functioning tool in minutes rather than hours
  2. Learning Opportunity: Even with minimal Go experience, we built a working solution
  3. Iterative Improvement: Each prompt built upon previous work, refining the solution step by step
  4. Problem Solving: When encountering challenges like extracting publication dates, we leveraged AI to explore alternatives

Go was an excellent choice for this CLI tool for several reasons. First, Go's compilation to a single binary makes distribution simple across different operating systems without dependency headaches. Second, Go's concurrency model with goroutines is perfect for network operations like fetching web pages. Finally, Go's strong standard library provides excellent HTTP client and JSON handling capabilities with minimal external dependencies, making it ideal for command-line utilities.

While traditional development would involve searching documentation, reading Stack Overflow, and experimenting with different approaches, Vibe Coding allowed us to focus on the problem definition and solution requirements. The implementation details were handled collaboratively with Claude.

As software development continues to evolve, tools like this demonstrate how AI assistants can become valuable coding partners, handling the implementation details while developers focus on problem-solving and system architecture.

Next time you find yourself performing repetitive tasks or facing a challenge in an unfamiliar language, consider how Vibe Coding might help you create a custom solution more efficiently.
