FMZQuant

Preliminary Exploration of Python Crawler Application on FMZ Platform -- Crawling the Content of Binance Announcement

Recently I noticed that the community and library have no material on Python crawlers. In the spirit of well-rounded development as a quant, I picked up some basic crawler concepts. It soon became clear that crawler technology has far more depth and pitfalls than I expected, so this article is only a preliminary exploration: the simplest possible crawler exercise on the FMZ Quant Trading platform.

Requirement

People who like to buy newly listed coins want to learn about new listings on the exchange as early as possible. Watching the exchange website by hand all the time is obviously unrealistic. A crawler script can monitor the exchange's announcement page instead and detect new announcements, so that you are notified right away.

Preliminary Exploration

We start with a very simple program (a truly capable crawler script is far more complex, so take it one step at a time). The logic is straightforward: the program repeatedly requests the exchange's announcement page, parses the returned HTML, and checks whether the content of a specific tag has changed.
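The change-detection step described above can be sketched on its own, without any network access. This is only an illustration: the function name `detect_new` and the sample hrefs are made up for this example and are not part of the FMZ platform or the Binance site.

```python
def detect_new(previous_href, current_href):
    """Return (is_new, remembered_href) for one polling round."""
    if not current_href:
        # Nothing could be parsed this round; keep the old state.
        return False, previous_href
    if not previous_href:
        # First successful round: record a baseline without alerting.
        return False, current_href
    return current_href != previous_href, current_href


# Simulated polling rounds; the hrefs are made-up placeholders.
rounds = ["/announcement/a1", "/announcement/a1", None, "/announcement/b2"]
state = ""
alerts = []
for href in rounds:
    is_new, state = detect_new(state, href)
    alerts.append(is_new)

print(alerts)  # [False, False, False, True]
```

Only the last round raises an alert: the first round sets the baseline, the second sees no change, and the third (a failed fetch) leaves the remembered href untouched.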

Implementation code

Mature crawler frameworks exist, but since the requirement here is so simple, we can write the code directly.

The following Python libraries are needed:
requests, which can be thought of simply as a library for accessing web pages.
bs4 (Beautiful Soup 4), which can be thought of simply as a library for parsing the HTML of web pages.
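To get a feel for what bs4 does before looking at the full script, here is a minimal sketch that parses a hard-coded HTML string instead of a downloaded page. The markup, the class name `news-link`, and the hrefs are hypothetical; in the real script the text would come from requests.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page.
sample_html = """
<div>
  <a class="news-link" href="/en/support/announcement/first">First title</a>
  <a class="news-link" href="/en/support/announcement/second">Second title</a>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")   # Parse the text into a tree
first = soup.find("a", class_="news-link")          # First matching <a>, or None
href = first["href"]                                # Attribute access works like a dict
title = first.get_text()                            # Text inside the tag

print(href, title)
```

`find` returns only the first match, which is why the script can treat the first announcement link on the page as the latest one.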

Code:

from bs4 import BeautifulSoup
import requests

urlBinanceAnnouncement = "https://www.binancezh.io/en/support/announcement/c-48?navId=48"  # Binance announcement page address

def openUrl(url):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.108 Safari/537.36'}
    try:
        r = requests.get(url, headers=headers, timeout=10)   # Use the requests library to access the url; time out instead of hanging forever
    except requests.RequestException as e:                   # Network errors should not stop the monitoring loop
        Log("request error {}: {}".format(url, e))
        return None
    if r.status_code == 200:
        r.encoding = 'utf-8'
        # Log("success! {}".format(url))
        return r.text                          # Return page content text if access is successful
    else:
        Log("failed {}".format(url))
        return None                            # Caller treats None as "no data this round"


def main():
    preNews_href = ""
    lastNews = ""
    Log("watching...", urlBinanceAnnouncement, "#FF0000")
    while True:
        ret = openUrl(urlBinanceAnnouncement)
        if ret:
            soup = BeautifulSoup(ret, 'html.parser')                       # Parse web text into objects
            tag = soup.find('a', class_='css-1ej4hfo')                     # Find a specific tag; None if the page layout changed
            if tag is None or not tag.has_attr("href"):
                Log("announcement tag not found")                          # Avoid a TypeError when the tag is missing
            else:
                lastNews_href = tag["href"]                                # Get href
                lastNews = tag.get_text()                                  # Get the content in this tag
                if preNews_href == "":
                    preNews_href = lastNews_href                           # First round: record a baseline
                if preNews_href != lastNews_href:                          # A changed href means a new announcement was published
                    Log("New Cryptocurrency Listing update!")              # Print the prompt message
                    preNews_href = lastNews_href
        LogStatus(_D(), "\n", "preNews_href:", preNews_href, "\n", "news:", lastNews)
        Sleep(1000 * 10)
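The class name css-1ej4hfo looks auto-generated, so it may change whenever the page is redeployed. One possible alternative, shown here only as a sketch, is to locate the link by its href instead. The helper name `latest_announcement`, the marker string, and the sample markup are assumptions for illustration and have not been checked against the live page.

```python
from bs4 import BeautifulSoup


def latest_announcement(html, marker="/support/announcement/"):
    """Return (href, title) of the first link whose href contains marker, else None."""
    soup = BeautifulSoup(html, "html.parser")
    for a in soup.find_all("a", href=True):      # Only <a> tags that have an href
        if marker in a["href"]:
            return a["href"], a.get_text(strip=True)
    return None


# Hypothetical markup; class names and paths are placeholders.
sample_html = """
<nav><a href="/en/markets">Markets</a></nav>
<a class="css-abc123" href="/en/support/announcement/example-1"> Example One </a>
<a class="css-abc123" href="/en/support/announcement/example-2">Example Two</a>
"""

result = latest_announcement(sample_html)
print(result)  # ('/en/support/announcement/example-1', 'Example One')
```

On a real page, navigation or category links might also contain the marker, so the filter would likely need tightening after inspecting the actual HTML.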

Operation

The script can be extended. For example, when a new announcement is detected, it could parse the new currency named in the announcement and place an order automatically to buy the newly listed coin.
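As a rough sketch of the "parse the new currency" step, the snippet below pulls a ticker symbol out of an announcement title with a regular expression. The title shape "... Will List Name (SYMBOL)" is an assumption, and both the pattern and the helper name `extract_symbol` are hypothetical; real titles may be worded differently and would need their own patterns. Order placement is left out on purpose.

```python
import re

# Assumed title shape: "... Will List <Name> (<SYMBOL>)"; real titles may differ.
LISTING_PATTERN = re.compile(r"Will List .*?\(([A-Z0-9]{2,10})\)")


def extract_symbol(title):
    """Return the symbol in parentheses after 'Will List', or None if absent."""
    m = LISTING_PATTERN.search(title)
    return m.group(1) if m else None


print(extract_symbol("Binance Will List Example Token (EXT)"))  # EXT
print(extract_symbol("Notice of System Maintenance"))           # None
```

Returning None for titles that do not match lets the caller skip unrelated announcements instead of acting on a wrong guess.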

From: https://blog.mathquant.com/2022/12/16/preliminary-exploration-of-python-crawler-application-on-fmz-platform-crawling-the-content-of-binance-announcement.html
