DEV Community

Iurii Panarin


Fitter - an open source no-code tool for map-reducing data from different sources, and more!

Hello everyone!

I am Pxyup, and today I want to present my open source project, Fitter.

PxyUp / fitter

A new way to collect information from APIs and websites

Fitter - a new way to collect information from APIs and websites

Fitter CLI - a small CLI command that returns results from Fitter for testing, debugging, or home usage

Fitter Lib - a library that provides the functionality of Fitter CLI

Ways to collect information

  1. Server - parses the response from an API or HTTP request (uses http.Client)
  2. Browser - emulates a real browser using Chromium + Docker + Playwright/Cypress and reads DOM information
  3. Static - parses a static string as data

Formats that can be parsed

  1. JSON - parses JSON to get specific information
  2. XML - parses an XML tree to get specific information
  3. HTML - parses a DOM tree to get specific information
  4. XPath - parses a DOM tree to get specific information, but using XPath
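
To make "get specific information" concrete for the JSON case, here is a small sketch that walks a dot-separated path through a decoded JSON document using only the standard library. The lookup helper and its path format are assumptions for this example; Fitter's own selectors may look and behave differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lookup decodes raw JSON and follows a dot-separated path of object keys,
// e.g. "content.title". Hypothetical helper for illustration only.
func lookup(raw []byte, path string) (any, error) {
	var doc any
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, err
	}
	cur := doc
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]any)
		if !ok {
			return nil, fmt.Errorf("%q: parent is not an object", key)
		}
		cur, ok = obj[key]
		if !ok {
			return nil, fmt.Errorf("%q: key not found", key)
		}
	}
	return cur, nil
}

func main() {
	raw := []byte(`{"id": 41975981, "content": {"title": "comment"}}`)
	title, err := lookup(raw, "content.title")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(title)
}
```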

Use as a library

go get github.com/PxyUp/fitter
package main
import (
    "fmt"
    "github.com/PxyUp/fitter/lib"
    "github.com/PxyUp/fitter/pkg/config"
    "log"
    "net/http"
)

func main() {
    res, err := lib.Parse
…

How it was created

In 2023, I worked on an idea called Trip Searcher:

1.  You enter a budget.
2.  You specify a starting city or country.
3.  You set the trip duration and a range of possible start and end dates.

The Trip Searcher would monitor flights and return potential routes from the starting city, including total prices to various destinations, and send notifications to Telegram with:

1.  Flight costs (parsed from Google or Kiwi).
2.  Hotel prices (from Airbnb or Booking).
3.  Food costs (retrieved from Numbeo).

For this setup, I needed a list of countries, cities, and airport codes to plug into the sites mentioned. During development, I started thinking about how convenient it would be if all this information could be easily combined and parsed to streamline requests, which led to the idea for a project I call Fitter.

P.S.: This project was for personal use.

Fitter CLI

A no-code map-reducer that returns data in user-friendly (JSON) or custom formats, suitable for storage in a database or transmission via HTTP.

Features:

  1. Supports parsing through HTML (query), JSON (gjson), XML, and XPath parsers.
  2. Retrieves data as a browser would, using Docker, Playwright (+ stealth mode), HTTP Client, Cache, File, or propagated fields, with support for custom plugins.
  3. Provides proxy support for Playwright and HTTP clients.
  4. Can send or store information to a file, webhook, console, and more via plugins.
  5. Handles all data types: int, float, bool, array, object, null, and string.
  6. Combines (map-reduce) and transforms fields.
  7. Utilizes the powerful expr library for template syntax, which is available across the application.
  8. Offered as a standalone binary and Docker version.
  9. Allows limits on request counts or instances for browser/Docker usage.

Examples

Static generation:

Here we just generate a static array from hardcoded data.

./fitter_cli_v1.0.18-darwin-amd64 --url=https://raw.githubusercontent.com/PxyUp/fitter/refs/heads/master/examples/cli/config_static_connector.json
[
        "PAGE: 1 INDEX: 0",
        "PAGE: 2 INDEX: 1",
        "PAGE: 3 INDEX: 2",
        "PAGE: 4 INDEX: 3",
        "PAGE: 5 INDEX: 4"
]
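
For readers who think in code, the output above corresponds roughly to a map over a hardcoded list where each item is combined with its zero-based index. The sketch below reproduces that shape in plain Go. It assumes the input is the list 1 to 5, which is inferred from the output only and not from the config file, and formatPages is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// formatPages maps each page number to a label that includes its zero-based index.
func formatPages(pages []int) []string {
	out := make([]string, 0, len(pages))
	for i, p := range pages {
		out = append(out, fmt.Sprintf("PAGE: %d INDEX: %d", p, i))
	}
	return out
}

func main() {
	pages := []int{1, 2, 3, 4, 5} // hardcoded input, assumed for illustration
	encoded, err := json.MarshalIndent(formatPages(pages), "", "        ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(encoded))
}
```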

Get current time

Gets information from a website and returns it to the user.

./fitter_cli_v1.0.18-darwin-amd64 --url=https://raw.githubusercontent.com/PxyUp/fitter/refs/heads/master/examples/cli/config_current_time.json
"Current time is: 19:18:51"

Get current Steam Sales

This will create a sales.md file in the provided directory.

Get best news from HackerNews + Comment list for each

In this config, we use template syntax to propagate the result from the first request to the next one.

./fitter_cli_v1.0.18-darwin-amd64 --url=https://raw.githubusercontent.com/PxyUp/fitter/refs/heads/master/examples/cli/config_cli.json
{
  "internal_url": "https://news.ycombinator.com/item?id=41975047",
  "content": {
    "kids": [
      {
        "response_id": "9db6ec73-3e35-4d06-b5eb-dbaa05a3a31d",
        "internal_url": "https://news.ycombinator.com/item?id=41975981",
        "content": {
          "text": "> We describe Flock as "Flutter+". In other words, we do not want, or intend, to fork the Flutter community. Flock will remain constantly up to date with Flutter.<p>That was the first fear when I saw the title - splitting community and having two incompatible versions. Good to see it addressed in the post.<p>The second was just a fear of how it would complicate the development process, but it seems to be a drop-in replacement (just configuring FVM - Flutter Version Manager):<p><pre><code>   Configure .fvmrc to use Flock:   {     "flutter": "master",     "flutterUrl": "https://github.com/Flutter-Foundation/flutter.git"   }</code></pre>Flutter is the best thing that happened to UI development since Qt. Most people don't realize how many apps written in Flutter they use daily, simply because it's impossible to tell. And the frustration described in the post is felt by many CTOs and developers. Especially those who use Flutter for desktop and web. Flutter provides an amazing experience for desktop apps, and precisely because of that, it feels so frustrating when you stumble upon some stupid bug that has been open for a year or two and never gets prioritized. Usually, it's nothing critical, but still requires workarounds and wasting time.<p>I don't know, the idea of Flock sounds good, the main question is engaging the community. Hopefully, the author (who seem to be an ex-Flutter team member himself) have a good grasp on the state of the community.<p>Wishing luck to the project and going to keep an eye on the progress.",
          "title": "comment"
        },
        "id": 41975981
      },
      {
        "content": {
          "text": "Back when I worked on GWT, we had trouble accepting outside contributions because the team had a mandate to support Googlers. That is, much like other libraries and tools at Google, changes could not break google3. This means <i>testing</i> patches against google3 and either changing the patch, or fixing whatever code used it, and these are tasks that no outsider can do.<p>Shepherding these patches is no fun when you have your own changes to work on that are more important to the team.<p>We did something similar, by creating an external fork where changes could be tried out by the community, without necessarily being accepted into the internal version.<p>I think a fork <i>could</i> work if there was enough external momentum, but even 20 people working full time would actually be pretty good for an open source project. How many developers will this fork attract? The fork would need to attract other businesses who can put people on it.<p>One downside is that the code isn't tested against google3. Sometimes you find actual bugs that way.<p>Edit: reading more closely, the complaint doesn't seem to be that patches weren't reviewed, but rather that bug reports weren't investigated. That's definitely something outside developers could do more of, and seems a lot easier than forking?",
          "title": "comment"
        },
        "response_id": "af15ca55-c45c-48a8-a18d-2fcbd7dd8f3b",
        "id": 41975765,
        "internal_url": "https://news.ycombinator.com/item?id=41975765"
      }
    ]
  }
}
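
The propagation idea can be sketched outside Fitter as well: IDs returned by a first request are rendered into a URL template for the follow-up requests. The example below uses Go's standard text/template purely as a stand-in. It is not Fitter's template syntax, and buildURLs is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// itemURL is a Go text/template used only as a stand-in for Fitter's own syntax.
var itemURL = template.Must(
	template.New("item").Parse("https://news.ycombinator.com/item?id={{.}}"),
)

// buildURLs renders one follow-up URL per ID returned by the first request.
func buildURLs(ids []int) ([]string, error) {
	urls := make([]string, 0, len(ids))
	for _, id := range ids {
		var sb strings.Builder
		if err := itemURL.Execute(&sb, id); err != nil {
			return nil, err
		}
		urls = append(urls, sb.String())
	}
	return urls, nil
}

func main() {
	// IDs as they might come back from the first request (taken from the output above).
	urls, err := buildURLs([]int{41975981, 41975765})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, u := range urls {
		fmt.Println(u)
	}
}
```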


Scrape all images from website and store them locally

./fitter_cli_v1.0.18-darwin-amd64 --url=https://raw.githubusercontent.com/PxyUp/fitter/refs/heads/master/examples/cli/config_image_multiple.json
[
        "/Users/pxyup/fitter/bin/1/basic-image.png",
        "/Users/pxyup/fitter/bin/2/alt-text.png",
        "/Users/pxyup/fitter/bin/3/no-size.png",
        "/Users/pxyup/fitter/bin/4/size.png",
        "/Users/pxyup/fitter/bin/5/image-with-title.png"
]

Fitter

Fitter is an extended version of Fitter CLI that adds:

  1. An HTTP server for triggers
  2. Responses returned as a Telegram message or webhook
  3. No documentation yet :)

Usage

This tool can be used for different purposes:

  1. Web scraper
  2. Data scraper with plugins
  3. Producing specific load for testing
  4. Building chat bots - I use it to automate my Telegram channel

For example, one job sends the best Dev.to articles every day.

Plans

  1. Add more browser tools, such as click/scroll (currently possible only through JS injection)
  2. Improve the template syntax
  3. Add a custom template editor and a config editor
  4. Maybe think about a SaaS for Fitter CLI that runs custom workflows for customers and returns results to an app, watch, etc.

I am really looking forward to your feedback! Ask any questions and I will answer.
