DEV Community

Isaac Addis

Posted on • Originally published at isaacaddis.github.io

Building an LLM-powered Facebook Marketplace Bot

#ai

I wanted to try bot development in Ethiopia’s emerging tech industry by building a bot that monitors product listings on Facebook Marketplace. I got past bot detection using a simple heuristic: mimic a real human. Since I eventually discarded this bot (Facebook Marketplace already has a “Notify Me” feature), I am happy to share how I built it.

Please note that this violates Facebook’s ToS.

System Overview

Here’s a summary of our tech stack:

  • DigitalOcean VPS running Ubuntu
  • Better Stack for logs (we sent raw HTTPS requests through helper functions because logging through the pino library wasn’t working)
  • pm2 for process management (this makes it easy to run a script as an always-on background task)
  • Slack for alerts
  • better-sqlite3 to track alerted listings (to avoid alerting about the same product twice)
  • OpenAI /chat/completions API with gpt-4o-mini for filtering product listings
  • TypeScript + OOP (I’ve found that bot development works really well with OOP)
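
To make the duplicate-alert tracking concrete, here is a minimal sketch of how it could work with better-sqlite3. The table name, column name, and function names are hypothetical, not taken from the original bot. The AlertDb interface only mirrors the two better-sqlite3 methods the sketch relies on (exec and prepare(...).run(...)), so a real Database instance should fit it, though I have not checked that against the package's type definitions.

```typescript
// Minimal structural view of the better-sqlite3 methods used below.
// In the bot you would pass a better-sqlite3 Database instance here.
interface AlertDb {
  exec(sql: string): unknown;
  prepare(sql: string): { run(...params: unknown[]): { changes: number } };
}

// Hypothetical schema: one row per listing we have already alerted on.
function initAlertTable(db: AlertDb): void {
  db.exec(
    "CREATE TABLE IF NOT EXISTS alerted_listings (listing_id TEXT PRIMARY KEY)"
  );
}

// Returns true the first time a listing is seen and false on every repeat.
// INSERT OR IGNORE skips the insert on a duplicate key, so `changes`
// is 1 for a new listing and 0 for one that is already stored.
function markIfNew(db: AlertDb, listingId: string): boolean {
  const result = db
    .prepare("INSERT OR IGNORE INTO alerted_listings (listing_id) VALUES (?)")
    .run(listingId);
  return result.changes === 1;
}
```

The bot would then only send a Slack alert when markIfNew returns true. Doing the check and the insert in one statement avoids a separate "have I seen this?" query.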

Something interesting we did was use an LLM to filter out listings unrelated to our product query. For example, when the bot searches for "iPhone 15", the LLM filters out listings for the "iPhone 15 Pro". We used gpt-4o-mini for this because of its high speed and low cost, and the results were perfect for us.
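
As a rough illustration of the filtering step, the sketch below builds a request body for the /chat/completions endpoint and interprets the reply. The prompt wording, the YES/NO protocol, and the function names are my assumptions, not the prompt the bot actually used. Sending the request (an HTTPS POST with an API key) is left out.

```typescript
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ChatRequest {
  model: string;
  temperature: number;
  messages: ChatMessage[];
}

// Builds the JSON body for POST https://api.openai.com/v1/chat/completions.
// The prompt text is an assumption made for this sketch.
function buildFilterRequest(query: string, listingTitle: string): ChatRequest {
  return {
    model: "gpt-4o-mini",
    temperature: 0, // keep answers as repeatable as possible
    messages: [
      {
        role: "system",
        content:
          "You decide whether a marketplace listing is for exactly the " +
          "product the user searched for. A different variant (for example " +
          "Pro or Pro Max) is not a match. Answer with only YES or NO.",
      },
      {
        role: "user",
        content: `Search query: ${query}\nListing title: ${listingTitle}`,
      },
    ],
  };
}

// Interprets the model's reply text (choices[0].message.content in the
// response). Anything other than a clear YES is treated as "not a match".
function isMatch(reply: string): boolean {
  return reply.trim().toUpperCase().startsWith("YES");
}
```

Treating unclear replies as "not a match" means a confused model answer costs a missed alert, not a wrong one.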

Bot alert showing an iPhone 13 Pro Max deal

Technical Challenges + Solutions

Captcha Harvesting

Note: being signed in is not required for monitoring Facebook Marketplace.

Captchas prevent the bot from signing in directly. To get past this, I wrote a captcha harvester script that opened Puppeteer on my local computer and saved the resulting authentication cookies to my local machine. With the scp command, I copied the cookies to my VPS. This works for getting past login walls on Facebook.

Proxy Rotation

I found that I needed to rotate proxies with our ProxyManager class when:

  • No amount of captchas would get me signed in
  • I would suddenly get a login wall when trying to monitor products
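
The original ProxyManager class is not shown in this post, so the following is only a guess at its general shape: a round-robin list of proxies plus a check for the two rotation triggers above. The Proxy fields, the PageState names, and the method names are all hypothetical. The one external detail it relies on is Chromium's --proxy-server flag, which can be passed in Puppeteer's launch args.

```typescript
interface Proxy {
  host: string;
  port: number;
}

class ProxyManager {
  private readonly proxies: Proxy[];
  private index = 0;

  constructor(proxies: Proxy[]) {
    if (proxies.length === 0) {
      throw new Error("ProxyManager needs at least one proxy");
    }
    this.proxies = proxies;
  }

  current(): Proxy {
    return this.proxies[this.index];
  }

  // Move to the next proxy, wrapping around at the end of the list.
  rotate(): Proxy {
    this.index = (this.index + 1) % this.proxies.length;
    return this.current();
  }

  // Chromium flag for the current proxy, for Puppeteer's launch `args`.
  launchArg(): string {
    const { host, port } = this.current();
    return `--proxy-server=${host}:${port}`;
  }
}

// Hypothetical labels for what the bot observed on its last attempt.
type PageState = "ok" | "login_wall" | "captcha_loop";

// Rotate on either trigger from the list above.
function shouldRotate(state: PageState): boolean {
  return state === "login_wall" || state === "captcha_loop";
}
```

Because the proxy is set at launch time, rotating in this sketch implies relaunching the browser with the new launchArg value.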

Residential Proxies

I used residential proxies to make it seem like traffic was coming from realistic IPs.

Puppeteer Stealth Mode

LLMs will suggest this plugin. A crucial thing it does is obscure navigator.webdriver, a property that gives away browser automation with tools like Puppeteer or Playwright.

User Agent

I configured the browser user agent to a Windows one, despite running on a Linux VPS: await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36');

Tips

  • Create scripts with LLMs to debug things. For example, I needed to create a script to check if Facebook recognized my authentication cookies with the Puppeteer settings we use in the bot

  • Avoid alerting on obvious scams by filtering out prices that seem too low

  • Approximating what a human user would do proved to be a good tactic for getting past bot protection on Facebook Marketplace.
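
The post does not say how "too low" was decided for the scam tip above, so here is one possible sketch: compare a listing's price with the median of comparable listings and flag anything far below it. The 0.4 ratio and the function names are assumptions you would need to tune.

```typescript
// Median of a list of prices (the input array is not modified).
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Flags a price that is below `minRatio` times the median of comparable
// listings. `minRatio` is a hypothetical tuning knob, not a known-good value.
function isSuspiciouslyCheap(
  price: number,
  comparablePrices: number[],
  minRatio = 0.4
): boolean {
  if (comparablePrices.length === 0) return false; // nothing to compare against
  return price < median(comparablePrices) * minRatio;
}
```

Using the median instead of the mean keeps one absurdly priced listing from shifting the threshold.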

Techniques used here to get around bot protection should be applicable to other services.
