The average job seeker spends 11 hours per week searching for jobs, according to LinkedIn. For tech roles it's even worse: you're dealing with hundreds of postings across multiple platforms. When my partner started her job search, she was spending hours every day just scrolling through LinkedIn. There had to be a better way.
The Challenge
For web developers, the market is overwhelming. A single search for "Frontend Developer" in London returned 401 results. Each posting requires:
- 5 seconds to review the title
- 3-4 clicks to view details
- 30-60 seconds to scan requirements
- Manual copy-pasting to track interesting roles
- Constant tab switching and back-navigation
For 401 jobs, at a minute or so each, that adds up to six or seven hours of pure mechanical work!
The Solution: Automation Pipeline
I built a three-step automation pipeline that cut the process down to 10 minutes:
- Scrape job data using Python
- Filter in bulk using Google Sheets
- Review only the most promising matches
Step 1: Smart Scraping
I used JobSpy as the base and built JobsParser to handle:
- A command-line interface (CLI)
- Rate limiting (to avoid LinkedIn blocks)
- Retry logic for failed requests
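JobsParser's internals aren't shown here, but here is a minimal sketch of the retry-with-cool-down idea, assuming JobSpy's scrape_jobs entry point; the attempt and delay values are illustrative, not JobsParser's actual settings:

import time

from jobspy import scrape_jobs  # JobSpy's entry point (pip install python-jobspy)

def scrape_with_retries(max_attempts=3, delay_seconds=30, **kwargs):
    # Retry scrape_jobs a few times, sleeping between attempts as crude rate limiting.
    for attempt in range(1, max_attempts + 1):
        try:
            return scrape_jobs(**kwargs)
        except Exception as exc:  # LinkedIn blocks usually surface as HTTP errors
            if attempt == max_attempts:
                raise
            print(f"Attempt {attempt} failed ({exc}); sleeping {delay_seconds}s")
            time.sleep(delay_seconds)

jobs = scrape_with_retries(
    site_name="linkedin",
    search_term="Frontend Developer",
    location="London",
    results_wanted=200,
)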
Here's how to run it:
pip install jobsparser
Bonus: run your search manually on LinkedIn first, note the total result count, and use that number for the --results-wanted parameter.
jobsparser \
--search-term "Frontend Developer" \
--location "London" \
--site linkedin \
--results-wanted 200 \
--distance 25 \
--job-type fulltime
If jobsparser is not on your PATH, you can run it as a module directly:
python -m jobsparser \
--search-term "Frontend Developer" \
--location "London" \
--site linkedin \
--results-wanted 200 \
--distance 25 \
--job-type fulltime
The output is a CSV with rich data:
- Job title and company
- Full description
- Job type and level
- Posted date
- Direct application URL
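Before filtering, it's worth a quick sanity check of that CSV; a short pandas look-around works. The file name and the exact column names below are assumptions based on the fields above, so check your own header row:

import pandas as pd

jobs = pd.read_csv("jobs.csv")  # adjust to the file jobsparser wrote

print(len(jobs), "jobs scraped")
print(jobs.columns.tolist())               # verify the real column names
print(jobs[["title", "company"]].head())   # assumed column names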
JobSpy and JobsParser also support other job boards besides LinkedIn: Indeed, Glassdoor, Google, and ZipRecruiter.
Step 2: Bulk Filtering
While pandas seemed the obvious choice (and I gave it a fair try), Google Sheets proved more flexible. Here's my filtering strategy:
- Time Filter: posted within the last 7 days
  - Jobs older than a week have lower response rates
  - Fresh postings mean active hiring
- Experience Filter: "job_level" matching your experience. For my partner, who is looking for her first role, I kept:
  - "Internship"
  - "Entry Level"
  - "Not Applicable"
- Tech Stack Filter: "description" contains the word "React". More complex filters can be created to check for multiple technologies.
This cut 401 jobs down to 8 matches!
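If you do prefer pandas after all, a rough equivalent of the same three filters might look like this; the column names (date_posted, job_level, description) are assumptions based on the CSV fields above:

import pandas as pd

jobs = pd.read_csv("jobs.csv")  # path and column names assumed

# Time filter: keep postings from the last 7 days
jobs["date_posted"] = pd.to_datetime(jobs["date_posted"], errors="coerce")
recent = jobs[jobs["date_posted"] >= pd.Timestamp.now() - pd.Timedelta(days=7)]

# Experience filter: first-role levels only
levels = {"Internship", "Entry Level", "Not Applicable"}
entry = recent[recent["job_level"].isin(levels)]

# Tech stack filter: the word "React" in the description
matches = entry[entry["description"].str.contains(r"\bReact\b", na=False)]

print(f"{len(jobs)} jobs -> {len(matches)} matches")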
Step 3: Smart Review
For the filtered jobs:
- Quick scan of the title and company (10 seconds)
- Open promising job_url links in a new tab
- Check the description in detail
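To skip copy-pasting URLs one by one, you could even open the whole shortlist straight from Python with the standard webbrowser module; the CSV name and the job_url column are assumptions:

import webbrowser

import pandas as pd

matches = pd.read_csv("filtered_jobs.csv")  # assumed: your exported shortlist

# Open each promising posting in its own browser tab
for url in matches["job_url"].dropna():
    webbrowser.open_new_tab(url)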
Conclusion
I hope this tool helps make your job search a slightly more enjoyable experience.
If you have any questions or feedback, please let me know.
Comments
I created a similar project (but with JS). It worked well a few months ago, but I see now that they've added captchas.
I also added the option to scan job descriptions with AI (OpenAI). I described the specific job I needed, with some constraints, and the API returned JSON with a "fit" percentage, salary, location, and an explanation of why the job is a fit. Most of the jobs were crap (by my requirements); out of maybe 1,000 jobs, 50 were good enough. But yeah, AI saved me some time going through all those postings.
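(Not the commenter's actual code, but that AI-scoring idea could be sketched roughly like this; the model name, prompt wording, and JSON fields are all assumptions:)

import json

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

REQUIREMENTS = "Frontend role using React, remote or London"  # your constraints

def score_job(description: str) -> dict:
    # Ask the model for a JSON verdict on a single job description
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any JSON-capable model works
        response_format={"type": "json_object"},
        messages=[{
            "role": "user",
            "content": (
                f"My requirements: {REQUIREMENTS}\n\n"
                f"Job description:\n{description}\n\n"
                'Reply as JSON: {"fit_percent": 0-100, "salary": "...", '
                '"location": "...", "explanation": "..."}'
            ),
        }],
    )
    return json.loads(response.choices[0].message.content)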
I wish there were an API or something so we didn't have to use shady scraping tactics. They probably have an API, but only for HR to post jobs.
There is a LinkedIn API, but it's extremely expensive. I also implemented a scraper with BeautifulSoup + PySpark to track emerging in-demand skills, and it worked pretty well.
I have been looking into a programmatic approach to job search, much like what you are posting here, but I always run into the risk of being locked out or blocked by Indeed and LinkedIn when scraping.
Doing it from the same IP that you use to log in, how does that work out for you?
I am definitely going to give it a try, that would simplify the search by far.
I've just pushed version 0.1.4, which hooks into JobSpy's proxy option:
E.g. --proxies '208.195.175.46:65095' --proxies '208.195.175.45:65095'
Thanks for your comment; I had totally missed this possibility.
This would be my concern as well, besides the fact that scraping the site might violate their robots.txt too. Doesn't LinkedIn at least support some sort of approved API for job listings? I mean, if they do, and they're charging for using the API, then scraping the site definitely seems unethical and possibly slightly illegal.
Well, job listings and job search are two different things. They certainly support a job listings API, since there are tons of other HR systems that post to LinkedIn.
It's all about money: because they sell promoted job listings, LinkedIn does not offer a publicly available API to search for jobs, forcing people to use their site with its very broken search.
I disagree that it's unethical, since it's not their proprietary information; these are basically public postings.
This is the plain language of their robots.txt:
"# Notice: The use of robots or other automated means to access LinkedIn without
the express permission of LinkedIn is strictly prohibited.
See linkedin.com/legal/user-agreement.
LinkedIn may, in its discretion, permit certain automated access to certain LinkedIn pages,
for the limited purpose of including content in approved publicly available search engines.
If you would like to apply for permission to crawl LinkedIn, please email whitelist-crawl@linkedin.com.
Any and all permitted crawling of LinkedIn is subject to LinkedIn's Crawling Terms and Conditions.
See linkedin.com/legal/crawling-terms."
Like it or not, unless you've got their permission to scrape the site, you're violating their robots.txt. That's at least unethical and probably illegal (although I don't believe they'd bother to sue anyone).
I don't disagree with what you said.
As far as I know, this is an ethical concern. If we ignore it, we might face restrictions. But we should think about why they restricted this in the first place: they did it to prevent software from doing harmful things by scraping their website. If scraping just provides another way to use the site, though, it could be good for both providers like LinkedIn and their clients.
I am not sure what you mean by "get some restrictions". It is a closed system; I don't see a way for there to be more restrictions.
Why is clear: LinkedIn has promoted jobs that appear in your search no matter what you search for. It's their income. It is a company, so it's understandable that it's about money. If they provided an API to search for jobs, there would be a ton of recruiting services offering proper search to their clients, and LinkedIn would lose its promotional income.
What I don't like is that LinkedIn holds a massive number of jobs within its system without providing a proper way to search them. I think that's unfair.
I would happily use their search and notifications if they were properly done and useful, especially when the IT industry is such a mess.
Yes, I see. I was only thinking of small-scale support. I think you're right: that would be a loss for both providers and clients if users don't use the service as the provider expected. Thanks for sharing your thoughts 👍
Scraping is not a problem. Applying is. Great project nonetheless
Thanks, easier than using an AI agent.
This is a neat article. I especially appreciate the comment thread below about LinkedIn's policy on web scraping. If I ever try this, I'll be sure to apply for a white-listing first!
Awesome, thanks for this!
Recently I read about datasheets and web scraping on LinkedIn. Good post! :)