DEV Community

Siddharth Chandra


BeautifulSoup: The Only Option?

There are a lot of ways a developer can scrape the web using Python, so why do we tend to rely on BeautifulSoup as our first choice, or even our only choice?

Is it because when we google "web scraping using Python" we get a whole lot of links to BeautifulSoup tutorials? Or because we actually know the benefits of using BeautifulSoup?

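Much of the same functionality can be achieved with the standard library alone (urllib to fetch pages, plus a parser such as html.parser to read them), but that approach has its own limitations. One such limitation is having to write from scratch several methods that are readily available in BeautifulSoup.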

On the other hand, writing methods from scratch lets us define custom behaviour!
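To make that trade-off concrete, here is a minimal sketch of a from-scratch link extractor built only on the standard library. The names `LinkCollector`, `extract_links` and `SAMPLE_HTML` are made up for illustration, and the HTML is a hard-coded sample rather than a fetched page. The custom behaviour here is resolving relative links against a base URL and skipping anchors that have no href.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the href attribute of every <a> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Custom behaviour: ignore anchors without an href and
            # resolve relative links against the base URL.
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in the given HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample page. In practice the HTML could come from
# urllib.request.urlopen(url).read().decode().
SAMPLE_HTML = """
<html><body>
  <a href="/about">About</a>
  <a href="https://example.org/docs">Docs</a>
  <a name="no-href">Anchor without href</a>
</body></html>
"""

links = extract_links(SAMPLE_HTML, "https://example.com")
print(links)
```

With BeautifulSoup the collection step is roughly a single `find_all("a")` call, which is the convenience we give up; in exchange, every rule about what counts as a link is ours to decide.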

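Sometimes the HTML is so disorganised that BeautifulSoup may not interpret the tags properly. How it copes depends partly on the underlying parser it is given (html.parser, lxml or html5lib), since each one repairs broken markup differently.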
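As a rough illustration of what "disorganised" means to a parser, the sketch below counts tags that are opened but never closed, again using only the standard library. `TagBalance` and `unclosed_tags` are hypothetical helpers, not part of any scraping library, and the check is deliberately naive: void elements such as `<br>` would also be reported as unclosed.

```python
from collections import Counter
from html.parser import HTMLParser


class TagBalance(HTMLParser):
    """Counts how often each tag is opened and closed."""

    def __init__(self):
        super().__init__()
        self.opened = Counter()
        self.closed = Counter()

    def handle_starttag(self, tag, attrs):
        self.opened[tag] += 1

    def handle_endtag(self, tag):
        self.closed[tag] += 1


def unclosed_tags(html):
    """Return {tag: count} for tags opened more often than closed."""
    parser = TagBalance()
    parser.feed(html)
    parser.close()
    # Counter subtraction keeps only the positive differences.
    return dict(parser.opened - parser.closed)


# Two <p> tags but only one </p>, and a <b> that is never closed.
messy = "<p>one<p>two<b>bold</p>"
print(unclosed_tags(messy))
```

The tokenizer only reports the tags it sees. Deciding where the missing closing tags belong is the job of whatever builds the tree, which is why different parsers can turn the same messy page into different structures.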

There are also forms we may need to scrape, and for those we need something extra - MechanicalSoup!

Yes, another ‘SOUP’ library (I don’t know why the scraping community loves soup so much, or does it stand for Software Of Unknown Pedigree?)
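To see what that "something extra" involves, here is a hedged sketch of preparing a form submission by hand with the standard library. This is not MechanicalSoup's API: `build_form_request`, the URL and the field names are all assumptions made up for the example, and nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_request(action_url, fields):
    """Build (but do not send) a POST request carrying form fields."""
    body = urlencode(fields).encode("utf-8")
    return Request(
        action_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical form: the action URL and field names are invented.
request = build_form_request(
    "https://example.com/search",
    {"q": "beautifulsoup", "lang": "python"},
)
print(request.get_method(), request.full_url, request.data)
# The request would only be sent by urllib.request.urlopen(request).
```

Even this leaves out locating the form in the page, copying its hidden fields and keeping cookies between requests, which is the kind of bookkeeping a form-aware library is meant to take off our hands.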

There are so many modules for any particular task, so why aren’t we making a pros/cons list of them instead of simply following whatever a tutorial mentions?

If we know how to debug code, then we should just dive into the source of these open-source libraries and see for ourselves whether they solve our problem the way we want.

What are your views on the different scraping libraries available? Which one do you prefer or use regularly?
