
The Hidden Web


Cut Bright Data costs by up to 95%

Hi, I’ve been working professionally in web automation and business-flow automation for over 10 years.

If you use Bright Data, you know it can get really expensive fast.
A large part of my work is cutting Bright Data costs for clients by up to 95%.

Here’s how I do it, so you can start reducing your BD costs now:

  1. Understand what you are paying for

  2. Create custom cookies to allow authentication

  3. Create direct access between client and data


👾 I’ll be focusing on the concepts here, because AI can handle most of the coding work nowadays.

Whether you’re hiring a developer or trying to DIY, **I’ve found that real understanding is far more valuable than technical details.**

P.S. I use Cursor, which comes with AI. Very friendly interface.


1) Understand What You Are Paying For

Bright Data mostly sells proxies and data.
There are two types of data they sell:

  • **Dataset** – Bright Data has already scraped the website and hosts the results in a database. You pay a fee to access the data.

  • **Web scraper tools** – Bright Data charges you to bypass the websites’ blocks and scrape on demand.

The dataset option is already cheaper, but its biggest drawback is low accuracy.

Most of my clients found the datasets not fresh enough to use in real market products.

Here’s what most people don’t understand: **all the scraper tool is really doing is handling the bypassing for you**.


Requests

When you try to access a website, your browser or software sends a request to the site’s server.

Requests sent by software (rather than a real browser) are often blocked.


Bright Data’s scraper tool charges you to bypass this block.

Once your request is allowed, you can obtain data.

In other words, if you can handle the bypassing, you will no longer need BD.
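To make this concrete, here’s a minimal sketch using Python’s `requests` library (a common choice, not necessarily what BD uses internally). The URL is a placeholder, and `prepare_request` only builds the request without sending it. The difference between a blocked and an allowed request often starts with headers like `User-Agent`:

```python
import requests

session = requests.Session()

# A bare request announces itself as a script, which many sites block on sight.
bare = session.prepare_request(
    requests.Request("GET", "https://example.com/products"))
print(bare.headers["User-Agent"])   # something like "python-requests/2.x"

# The same request with browser-like headers looks like a normal visitor.
browser_headers = {
    "User-Agent": ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                   "AppleWebKit/537.36 (KHTML, like Gecko) "
                   "Chrome/124.0 Safari/537.36"),
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.9",
}
disguised = session.prepare_request(
    requests.Request("GET", "https://example.com/products",
                     headers=browser_headers))
print(disguised.headers["User-Agent"])
```

Headers alone won’t get you past every site, which is exactly where cookies come in below.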



2) Create custom cookies to allow authentication

Websites evaluate your cookies to determine whether to allow your requests.

🍪Cookies are small browser-stored values that control session state and access.

If a website is an event, a cookie is like the ticket to enter the event.


Since websites will typically block your requests, most of the work is in simulating valid cookies.


To create the cookies:

  1. Find out the minimum set of cookies you need (they come in sets)

  2. Generate or bulk-collect them

You can reverse-engineer the site and generate the cookies yourself, or collect them from user browsers.

It might sound a little complicated, but for a lot of websites, it is much easier than you think.

Once you've done that, you'll have functional cookies.
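Step 1 above, finding the minimum cookie set, is essentially an elimination loop. Here’s a sketch in Python; `request_ok` is a hypothetical stand-in for “send a test request with this cookie set and see whether the site accepts it”:

```python
def minimal_cookie_set(cookies, request_ok):
    """Drop cookies one at a time, keeping only the ones the site checks.

    `cookies` is a dict of name -> value; `request_ok(trial)` sends a
    test request with the trial cookie set and reports success.
    """
    needed = dict(cookies)
    for name in list(needed):
        trial = {k: v for k, v in needed.items() if k != name}
        if request_ok(trial):   # the site still accepts the request,
            needed = trial      # so this cookie is not required
    return needed


# Simulated site that only validates two of the four cookies:
def fake_site_ok(cookies):
    return {"session-id", "csrf-token"} <= cookies.keys()

full = {"session-id": "abc", "csrf-token": "xyz",
        "ad-prefs": "1", "ui-theme": "dark"}
print(minimal_cookie_set(full, fake_site_ok))
# {'session-id': 'abc', 'csrf-token': 'xyz'}
```

In practice `request_ok` would be a real request to the target site, rate-limited so the probing itself doesn’t get you blocked.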



3) Create direct access between client and data

Now all you have to do is **inject the cookies into your requests.**

That basically means attaching them to the request headers. (You can always give the AI your cookies and tell it to handle the rest.)

Once you have functional requests, instead of sending a request to BD, you send it directly to the website.

Example: Let’s say I am using BD’s Amazon Scraper Tool “Collect By URL.” What I would do here is send a request directly to Amazon using the URL as my input.
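A minimal sketch of that injection step with Python’s `requests`. The cookie names, values, and product URL are all placeholders; nothing is sent here, since `prepare_request` just builds the request so you can see where the cookies land:

```python
import requests

# Illustrative cookies; the real names and values come from step 2.
cookies = {"session-id": "abc123", "csrf-token": "xyz789"}

session = requests.Session()
req = requests.Request(
    "GET",
    "https://www.amazon.com/dp/EXAMPLE",   # placeholder product URL
    headers={"User-Agent": "Mozilla/5.0"},
    cookies=cookies,
)
prepared = session.prepare_request(req)

# The cookies get serialized into the Cookie request header:
print(prepared.headers["Cookie"])
```

Calling `session.send(prepared)` would then fetch the page directly, with no BD in the middle.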

🎉 Voilà! At this point, all of the cost **spent on data** from BD is eliminated!

Proxies

I will then switch to cheaper proxy services to cut costs further.
BD has really high-quality proxies, but does your project need them?

I usually try a lot of cheaper proxies.

There is a bit of mix-and-matching going on.

But the goal here is to find the cheapest service that is still good enough for your use case.
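Swapping proxies in `requests` is a one-line change, which is what makes this mix-and-matching cheap to test. A sketch, with a made-up provider endpoint; the credentials and host are placeholders you’d get from whichever cheaper service you pick:

```python
import requests

# Hypothetical endpoint; most providers use scheme://user:pass@host:port.
CHEAP_PROXY = "http://user:pass@proxy.cheap-provider.example:8000"

session = requests.Session()
session.proxies = {"http": CHEAP_PROXY, "https": CHEAP_PROXY}

# From here, every request on this session is routed through the proxy,
# e.g. session.get(product_url, headers=browser_headers, cookies=cookies)
```

From there it’s pure price/quality testing: swap the endpoint, rerun your scraper, and keep the cheapest proxy that still gets through.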

This is how I eliminate the cost of data from BD and lower the cost of the proxies.
And for most of my clients, this leads to a 70–95% price reduction.

The more data you are getting from BD, the more effective this method will be.



P.S. I made a video applying this same method to get Amazon data. It's a very fun watch haha 🔥 ➜ Here
