DEV Community

Waylon Walker

Posted on • Originally published at waylonwalker.com


Set User Agent on pandas read_csv

I keep a small cars.csv on my website for quickly trying out different pandas operations. It's very handy to keep around to figure out what an unfamiliar method does, or to give a teammate an example they can replicate.

Hosts switched

I recently switched hosting from Netlify over to Cloudflare. Cloudflare does some work to block requests that it does not think come from a real user. One of these checks is to ensure there is a real user agent on the request.

Not my go-to dataset 😭

This breaks my go-to example dataset.

import pandas as pd

pd.read_csv("https://waylonwalker.com/cars.csv")

# HTTPError: HTTP Error 403: Forbidden

But requests works???

What's weird is that requests still works just fine! I'm not sure why fetching with urllib, the way pandas does, breaks the request, but it does.

import requests

requests.get("https://waylonwalker.com/cars.csv")

<Response [200]>
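
One likely difference between the two libraries is the default User-Agent each one sends: urllib announces itself as Python-urllib/<version>, while requests sends python-requests/<version>. That Cloudflare blocks the first and allows the second is my assumption, not something I have confirmed. A minimal sketch, using only the standard library, to see what urllib sends by default:

```python
import urllib.request

# Build the same kind of opener that urllib.request.urlopen uses by default.
opener = urllib.request.build_opener()

# addheaders holds the headers added to every request made through the opener.
# By default it is a single User-agent entry such as "Python-urllib/3.x".
default_headers = dict(opener.addheaders)

print(default_headers["User-agent"])
```

Nothing is sent over the network here; the snippet only inspects the default headers.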

Setting the User Agent in pandas.read_csv

This fixed the issue for me!

After a bit of googling I realized that this is a common problem, and that setting the User-Agent fixes it. That is when I remembered seeing in the Cloudflare dashboard that they protect against a lot of different attacks. Apparently it treats pd.read_csv as an attack on my Cloudflare Pages site.

pd.read_csv("https://waylonwalker.com/cars.csv", storage_options={"User-Agent": "Mozilla/5.0"})

# success
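
For HTTP(S) URLs, pandas documents that the storage_options key-value pairs are forwarded to urllib.request.Request as headers. Here is a rough sketch of what that looks like with plain urllib. It only builds the request object and does not send it, and the variable names are mine, not anything from pandas:

```python
import urllib.request

url = "https://waylonwalker.com/cars.csv"
headers = {"User-Agent": "Mozilla/5.0"}  # the same dict passed as storage_options

# Build the request without sending it.
request = urllib.request.Request(url, headers=headers)

# urllib stores header names via str.capitalize(), hence "User-agent" here.
print(request.get_header("User-agent"))
```

A header set on the Request takes precedence over the opener's default, so the Python-urllib value is never sent.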

Now my data is back

Now this works again, but it feels like a bit more typing than I want to do by hand every time. I might need to look into my Cloudflare settings to see if I can allow this dataset to be accessed by pd.read_csv.
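
Until then, one way to cut down on the typing is a tiny wrapper. This is only a sketch: with_user_agent and read_csv_with_user_agent are hypothetical helper names I made up for illustration, not part of pandas, and the default value is just the one used above.

```python
DEFAULT_USER_AGENT = "Mozilla/5.0"  # the value used earlier in this post


def with_user_agent(storage_options=None, user_agent=DEFAULT_USER_AGENT):
    """Return a copy of storage_options that includes a User-Agent header.

    A User-Agent already supplied by the caller is left untouched.
    """
    options = dict(storage_options or {})
    options.setdefault("User-Agent", user_agent)
    return options


def read_csv_with_user_agent(url, user_agent=DEFAULT_USER_AGENT, **kwargs):
    """Hypothetical wrapper: call pd.read_csv with a User-Agent header set."""
    # Imported here so with_user_agent stays usable without pandas installed.
    import pandas as pd

    kwargs["storage_options"] = with_user_agent(
        kwargs.get("storage_options"), user_agent
    )
    return pd.read_csv(url, **kwargs)
```

With that in a scratch module, read_csv_with_user_agent("https://waylonwalker.com/cars.csv") should behave like the longer call above, and any other read_csv keyword arguments pass straight through.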
