<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: KazKN</title>
    <description>The latest articles on DEV Community by KazKN (@boo_n).</description>
    <link>https://dev.to/boo_n</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3838197%2Fb1265af7-3c38-4a22-bf7d-40cc4b8ad2fd.jpeg</url>
      <title>DEV Community: KazKN</title>
      <link>https://dev.to/boo_n</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/boo_n"/>
    <language>en</language>
    <item>
      <title>LoopNet vs Crexi: Which Commercial Real Estate Platform Is Better?</title>
      <dc:creator>KazKN</dc:creator>
      <pubDate>Mon, 08 Jun 2026 22:08:35 +0000</pubDate>
      <link>https://dev.to/boo_n/loopnet-vs-crexi-which-commercial-real-estate-platform-is-better-41g0</link>
      <guid>https://dev.to/boo_n/loopnet-vs-crexi-which-commercial-real-estate-platform-is-better-41g0</guid>
      <description>&lt;h2&gt;
  
  
  CRE Listing Intelligence Series
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href="https://dev.to/boo_n/best-costar-alternatives-for-small-cre-brokers-52hn"&gt;Best CoStar alternatives for small CRE brokers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/boo_n/how-to-scrape-loopnet-and-crexi-listings-into-one-cre-dataset-4l7k"&gt;How to scrape LoopNet and Crexi listings into one CRE dataset&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/boo_n/commercial-real-estate-api-for-brokers-cheap-costar-alternative-834"&gt;Commercial real estate API for brokers: cheap CoStar alternative&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/boo_n/i-pulled-1234-dallas-cre-listings-from-loopnet-crexi-deduping-was-the-real-problem-4e3g"&gt;I Pulled 1,234 Dallas CRE Listings from LoopNet + Crexi. Deduping Was the Real Problem.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;LoopNet vs Crexi: Which Commercial Real Estate Platform Is Better?&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;If you work in commercial real estate, you have probably asked this question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Should I use &lt;a href="https://www.loopnet.com/" rel="noopener noreferrer"&gt;LoopNet&lt;/a&gt; or &lt;a href="https://www.crexi.com/" rel="noopener noreferrer"&gt;Crexi&lt;/a&gt;?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The short answer is:&lt;/p&gt;

&lt;p&gt;LoopNet is usually stronger for broad listing exposure and search visibility. Crexi is often stronger for a more modern deal-discovery workflow, auction/listing tools, and active investor browsing.&lt;/p&gt;

&lt;p&gt;But for commercial real estate research, the more useful answer is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You often need both. The real problem is turning both platforms into one clean market file.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This article compares LoopNet and Crexi from a practical broker and analyst perspective: listing discovery, audience, cost, broker leads, data quality, and workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;LoopNet&lt;/th&gt;
&lt;th&gt;Crexi&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Best known for&lt;/td&gt;
&lt;td&gt;Large commercial real estate marketplace and visibility&lt;/td&gt;
&lt;td&gt;Modern CRE marketplace, sale/lease listings, auctions, investor discovery&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Typical strength&lt;/td&gt;
&lt;td&gt;Broad search reach and brand recognition&lt;/td&gt;
&lt;td&gt;Faster-feeling marketplace workflow and deal browsing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Strong use case&lt;/td&gt;
&lt;td&gt;Marketing listings to a large tenant/investor audience&lt;/td&gt;
&lt;td&gt;Finding active opportunities and managing buyer interest&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Common buyer/user&lt;/td&gt;
&lt;td&gt;Brokers, owners, tenants, investors&lt;/td&gt;
&lt;td&gt;Brokers, investors, buyers, tenants&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data workflow&lt;/td&gt;
&lt;td&gt;Useful source, but not always analysis-ready as a raw export&lt;/td&gt;
&lt;td&gt;Useful source, but still needs cleanup for market research&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for research&lt;/td&gt;
&lt;td&gt;Strong when combined with other sources&lt;/td&gt;
&lt;td&gt;Strong when combined with other sources&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Main limitation&lt;/td&gt;
&lt;td&gt;Cost and workflow friction can be an issue depending on the user&lt;/td&gt;
&lt;td&gt;Coverage and data completeness can vary by market&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best practical answer&lt;/td&gt;
&lt;td&gt;Use it when visibility matters&lt;/td&gt;
&lt;td&gt;Use it when deal discovery and speed matter&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  What LoopNet does well
&lt;/h2&gt;

&lt;p&gt;LoopNet is one of the most recognized names in online commercial real estate listings.&lt;/p&gt;

&lt;p&gt;It is part of &lt;a href="https://www.costargroup.com/" rel="noopener noreferrer"&gt;CoStar Group&lt;/a&gt;, and CoStar describes LoopNet as a heavily trafficked online commercial real estate marketplace. LoopNet itself positions the site around commercial property for sale, lease, auctions, and businesses for sale.&lt;/p&gt;

&lt;p&gt;That matters because CRE listing platforms are partly about data and partly about attention.&lt;/p&gt;

&lt;p&gt;For a broker marketing a property, LoopNet's main advantage is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;More people know to search there.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That makes LoopNet especially relevant for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;office listings&lt;/li&gt;
&lt;li&gt;retail listings&lt;/li&gt;
&lt;li&gt;industrial listings&lt;/li&gt;
&lt;li&gt;lease listings&lt;/li&gt;
&lt;li&gt;sale listings&lt;/li&gt;
&lt;li&gt;mainstream tenant and investor discovery&lt;/li&gt;
&lt;li&gt;properties where broad online exposure matters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the job is "make this property visible to as many relevant online searchers as possible," LoopNet is hard to ignore.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Crexi does well
&lt;/h2&gt;

&lt;p&gt;Crexi is also a commercial real estate marketplace, but it often feels different from LoopNet in day-to-day use.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.crexi.com/" rel="noopener noreferrer"&gt;Crexi&lt;/a&gt; positions itself as a platform for brokers, buyers, tenants, and investors. Its marketplace includes sale and lease listings, and Crexi also has auction and deal-management features.&lt;/p&gt;

&lt;p&gt;In practice, Crexi can feel more natural for active deal browsing.&lt;/p&gt;

&lt;p&gt;It is often useful when the job is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;find new investment opportunities&lt;/li&gt;
&lt;li&gt;review broker-posted listings&lt;/li&gt;
&lt;li&gt;browse sale and lease inventory&lt;/li&gt;
&lt;li&gt;monitor auction-style opportunities&lt;/li&gt;
&lt;li&gt;discover smaller or mid-market listings&lt;/li&gt;
&lt;li&gt;compare a market beyond the largest incumbent platform&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Crexi's biggest strength is not that it "replaces" LoopNet in every market.&lt;/p&gt;

&lt;p&gt;Its strength is that it can show a different slice of the market.&lt;/p&gt;

&lt;p&gt;That matters because CRE listing coverage is not perfectly uniform. A broker, investor, or analyst who checks only one portal may miss listings that appear elsewhere.&lt;/p&gt;

&lt;h2&gt;
  
  
  Listing coverage: the answer depends on the market
&lt;/h2&gt;

&lt;p&gt;The most common mistake in a LoopNet vs Crexi comparison is trying to declare one universal winner.&lt;/p&gt;

&lt;p&gt;CRE is too local for that.&lt;/p&gt;

&lt;p&gt;In one market, LoopNet may have stronger inventory for a certain asset class. In another market, Crexi may show more relevant sale listings, more active broker pages, or better investor-facing deal flow.&lt;/p&gt;

&lt;p&gt;The answer can change by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;city&lt;/li&gt;
&lt;li&gt;asset class&lt;/li&gt;
&lt;li&gt;sale vs lease&lt;/li&gt;
&lt;li&gt;broker behavior&lt;/li&gt;
&lt;li&gt;property size&lt;/li&gt;
&lt;li&gt;paid listing strategy&lt;/li&gt;
&lt;li&gt;whether the searcher is a tenant, broker, investor, or analyst&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a tenant rep may care more about lease visibility&lt;/li&gt;
&lt;li&gt;an acquisition analyst may care more about sale inventory&lt;/li&gt;
&lt;li&gt;an investment sales broker may care more about buyer lead quality&lt;/li&gt;
&lt;li&gt;a market researcher may care more about completeness and deduplication&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is why "LoopNet vs Crexi" is usually the wrong operational question.&lt;/p&gt;

&lt;p&gt;The better question is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Which platform gives me the cleanest view of this specific market?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And often, the answer is both.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lead quality and broker workflow
&lt;/h2&gt;

&lt;p&gt;LoopNet and Crexi can both help generate attention around listings.&lt;/p&gt;

&lt;p&gt;But attention is not the same as usable broker intelligence.&lt;/p&gt;

&lt;p&gt;A broker or analyst usually needs to know:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;who represents the listing&lt;/li&gt;
&lt;li&gt;which company is attached&lt;/li&gt;
&lt;li&gt;whether the listing is new or stale&lt;/li&gt;
&lt;li&gt;whether the same property is listed somewhere else&lt;/li&gt;
&lt;li&gt;what price, rent, cap rate, NOI, or square footage is visible&lt;/li&gt;
&lt;li&gt;whether the row is clean enough to send to a CRM or spreadsheet&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where both platforms have the same limitation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;They are listing portals first. They are not always clean research databases for your internal workflow.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That does not make them bad.&lt;/p&gt;

&lt;p&gt;It just means the workflow does not end on the portal page.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cost: LoopNet vs Crexi
&lt;/h2&gt;

&lt;p&gt;Cost is one of the reasons brokers compare LoopNet and Crexi in the first place.&lt;/p&gt;

&lt;p&gt;LoopNet is connected to the CoStar ecosystem, and many CRE professionals perceive CoStar/LoopNet as expensive compared with lighter tools. Crexi is often discussed as a more accessible or flexible alternative, although pricing and value depend heavily on the product tier, market, and use case.&lt;/p&gt;

&lt;p&gt;The important point is not simply "which one is cheaper?"&lt;/p&gt;

&lt;p&gt;The better question is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;What job am I paying for?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If the job is listing exposure, the value depends on lead quality and closed business.&lt;/p&gt;

&lt;p&gt;If the job is market research, the value depends on how quickly a team can build a usable market file.&lt;/p&gt;

&lt;p&gt;For research, the hidden cost is usually manual work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;checking two tabs&lt;/li&gt;
&lt;li&gt;copying listings into spreadsheets&lt;/li&gt;
&lt;li&gt;cleaning addresses&lt;/li&gt;
&lt;li&gt;removing duplicates&lt;/li&gt;
&lt;li&gt;finding broker contacts&lt;/li&gt;
&lt;li&gt;comparing cap rates&lt;/li&gt;
&lt;li&gt;tracking days on market&lt;/li&gt;
&lt;li&gt;exporting rows into a CRM&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That hidden labor can cost more than the tool itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data quality: the real comparison
&lt;/h2&gt;

&lt;p&gt;For broker research, data quality is where the LoopNet vs Crexi question becomes interesting.&lt;/p&gt;

&lt;p&gt;The same property can appear on both platforms with slightly different fields.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1200 East 6th Street
1200 E 6th St
1200 E. Sixth Street
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Those may be the same property, but exact matching would miss it.&lt;/p&gt;

&lt;p&gt;Square footage can also vary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;7,500 SF
7,480 SF
7.5K SF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cap rates may be listed, missing, estimated, or implied from other fields.&lt;/p&gt;

&lt;p&gt;Broker contact data can be present on one source and missing on another.&lt;/p&gt;

&lt;p&gt;That is why the best research workflow is not just:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Search LoopNet
Search Crexi
Pick the better one
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is closer to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Search LoopNet
Search Crexi
Merge both sources
Normalize fields
Deduplicate listings
Preserve source links
Export one clean market file
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Which platform is better for CRE research?
&lt;/h2&gt;

&lt;p&gt;If I had to answer directly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LoopNet is better for broad visibility and familiar search behavior.&lt;/li&gt;
&lt;li&gt;Crexi is better for active deal discovery and a modern marketplace workflow.&lt;/li&gt;
&lt;li&gt;Using both is better for first-pass market research.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a broker, the final answer depends on the job:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Job&lt;/th&gt;
&lt;th&gt;Better fit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Advertise a listing to a large online audience&lt;/td&gt;
&lt;td&gt;LoopNet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Browse active deal opportunities&lt;/td&gt;
&lt;td&gt;Crexi&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compare public inventory across a market&lt;/td&gt;
&lt;td&gt;Both&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build a broker lead list&lt;/td&gt;
&lt;td&gt;Both&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Track stale listings and days on market&lt;/td&gt;
&lt;td&gt;Both, after cleanup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Create a CSV for underwriting or CRM&lt;/td&gt;
&lt;td&gt;Both, after extraction and normalization&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Where an extraction tool fits
&lt;/h2&gt;

&lt;p&gt;This is where a data workflow can help.&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://apify.com/kazkn/commercial-real-estate-brokerage-intel?fpr=8fp2od" rel="noopener noreferrer"&gt;Commercial Real Estate Brokerage Intel&lt;/a&gt; on &lt;a href="https://www.apify.com?fpr=8fp2od" rel="noopener noreferrer"&gt;Apify&lt;/a&gt; for this exact use case:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Run LoopNet + Crexi searches and turn the results into one structured dataset.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The point is not to declare one portal the winner.&lt;/p&gt;

&lt;p&gt;The point is to make both sources easier to use together.&lt;/p&gt;

&lt;p&gt;The actor is built to help with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LoopNet + Crexi listing extraction&lt;/li&gt;
&lt;li&gt;deduplication across platforms&lt;/li&gt;
&lt;li&gt;normalized cap-rate fields&lt;/li&gt;
&lt;li&gt;days-on-market context&lt;/li&gt;
&lt;li&gt;broker contact fields where available&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;also_listed_on&lt;/code&gt; signals&lt;/li&gt;
&lt;li&gt;CSV, Excel, JSON, and API export&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So if your workflow is "which platform should I manually check today?", the answer might be LoopNet or Crexi.&lt;/p&gt;

&lt;p&gt;But if your workflow is "I need one clean market file," the better answer may be:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Pull both, clean both, and compare the market from one dataset.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Final verdict
&lt;/h2&gt;

&lt;p&gt;LoopNet vs Crexi is not a simple winner-takes-all comparison.&lt;/p&gt;

&lt;p&gt;LoopNet usually wins on brand recognition, broad search behavior, and listing visibility.&lt;/p&gt;

&lt;p&gt;Crexi often wins on deal-discovery workflow, marketplace feel, and alternative inventory discovery.&lt;/p&gt;

&lt;p&gt;For commercial real estate brokers, investors, and analysts, the strongest workflow is often not choosing one platform.&lt;/p&gt;

&lt;p&gt;It is using both platforms without letting duplicate listings, messy addresses, missing broker fields, and inconsistent cap rates slow the team down.&lt;/p&gt;

&lt;p&gt;That is the real lesson:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The better platform is the one that helps you create a cleaner market view faster.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And in many CRE workflows, that market view comes from LoopNet and Crexi together.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.loopnet.com/" rel="noopener noreferrer"&gt;LoopNet&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.costargroup.com/about-us/brands/loopnet" rel="noopener noreferrer"&gt;LoopNet / CoStar Group brand page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://learn.crexi.com/sign-up-for-free-set-notifications-crexi-help-center" rel="noopener noreferrer"&gt;Crexi help center&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://apify.com/kazkn/commercial-real-estate-brokerage-intel?fpr=8fp2od" rel="noopener noreferrer"&gt;CRE Brokerage Intel on Apify&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>realestate</category>
      <category>data</category>
      <category>webscraping</category>
      <category>apify</category>
    </item>
    <item>
      <title>The Fastest Vinted Scraper Workflow: Paste a Search URL, Export CSV</title>
      <dc:creator>KazKN</dc:creator>
      <pubDate>Sat, 06 Jun 2026 02:13:23 +0000</pubDate>
      <link>https://dev.to/boo_n/vinted-scraper-export-vinted-search-results-to-csv-with-apify-1g1</link>
      <guid>https://dev.to/boo_n/vinted-scraper-export-vinted-search-results-to-csv-with-apify-1g1</guid>
      <description>&lt;p&gt;There are many Vinted scrapers.&lt;/p&gt;

&lt;p&gt;Most of them ask you to configure a form:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;keyword&lt;/li&gt;
&lt;li&gt;brand&lt;/li&gt;
&lt;li&gt;category&lt;/li&gt;
&lt;li&gt;size&lt;/li&gt;
&lt;li&gt;condition&lt;/li&gt;
&lt;li&gt;price range&lt;/li&gt;
&lt;li&gt;country&lt;/li&gt;
&lt;li&gt;sorting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That can be useful, but it is slow when you already have the perfect search open in your browser.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apify.com/kazkn/vinted-turbo-scraper?fpr=8fp2od" rel="noopener noreferrer"&gt;Vinted Turbo Scraper&lt;/a&gt; is different:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Paste one or multiple Vinted search URLs. Run the Actor. Export the listings.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is the whole point.&lt;/p&gt;

&lt;p&gt;No filter rebuilding. No heavy setup. No slow configuration flow.&lt;/p&gt;

&lt;p&gt;Just the fastest path from a Vinted search page to a structured dataset.&lt;/p&gt;

&lt;h2&gt;
  
  
  ⚡ Quick Answer
&lt;/h2&gt;

&lt;p&gt;The fastest way to scrape Vinted search results is to use the search URL itself. Open Vinted, apply your filters, copy the catalog URL, paste it into &lt;a href="https://apify.com/kazkn/vinted-turbo-scraper?fpr=8fp2od" rel="noopener noreferrer"&gt;Vinted Turbo Scraper&lt;/a&gt;, choose &lt;code&gt;maxItems&lt;/code&gt;, and export the results as CSV, Excel, Google Sheets, JSON, or API data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this Vinted scraper is different
&lt;/h2&gt;

&lt;p&gt;Generic Vinted scrapers often start with configuration.&lt;/p&gt;

&lt;p&gt;Vinted Turbo Scraper starts with the URL.&lt;/p&gt;

&lt;p&gt;That matters because Vinted already stores your filters in the search URL.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://www.vinted.fr/catalog?search_text=nike%20dunk&amp;amp;price_to=80&amp;amp;currency=EUR&amp;amp;order=newest_first
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That URL can contain the market, keyword, price, currency, sort order, and often category, brand, size, color, and condition filters.&lt;/p&gt;

&lt;p&gt;So instead of manually recreating your search inside an Actor input form, you use the search you already built on Vinted.&lt;/p&gt;

&lt;h2&gt;
  
  
  Turbo positioning
&lt;/h2&gt;

&lt;p&gt;Vinted Turbo Scraper is not trying to be a complex research suite.&lt;/p&gt;

&lt;p&gt;It is built for speed:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workflow&lt;/th&gt;
&lt;th&gt;Typical experience&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Generic Vinted scraper&lt;/td&gt;
&lt;td&gt;Configure filters one by one&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Browser copy-paste&lt;/td&gt;
&lt;td&gt;Manually collect listings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Custom script&lt;/td&gt;
&lt;td&gt;Maintain code, proxies, pagination, exports&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Vinted Turbo Scraper&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Paste Vinted URL -&amp;gt; run -&amp;gt; export dataset&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That is the product promise:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The fastest Vinted URL-to-CSV workflow on Apify.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Single URL or multiple URLs
&lt;/h2&gt;

&lt;p&gt;The other big advantage is batch input.&lt;/p&gt;

&lt;p&gt;You can paste:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one Vinted URL for a simple export&lt;/li&gt;
&lt;li&gt;multiple URLs from the same market&lt;/li&gt;
&lt;li&gt;multiple URLs across supported Vinted markets&lt;/li&gt;
&lt;li&gt;broad searches for larger exports&lt;/li&gt;
&lt;li&gt;niche searches for monitoring&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://www.vinted.fr/catalog?search_text=nike%20dunk&amp;amp;order=newest_first
https://www.vinted.fr/catalog?search_text=adidas%20samba&amp;amp;order=newest_first
https://www.vinted.fr/catalog?search_text=new%20balance&amp;amp;order=newest_first
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For resellers, this is the practical difference between checking one search and monitoring an entire sourcing board.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Tip: For 1,000+ results, batch several broad Vinted URLs. Some Vinted searches naturally stop before your target because of pagination or available inventory.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What you get back
&lt;/h2&gt;

&lt;p&gt;The Actor returns structured listing data that can be exported from Apify.&lt;/p&gt;

&lt;p&gt;Typical fields include:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Why it matters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;title&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Identify the listing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;price&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Pricing and deal analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;currency&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Multi-market workflows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;brand&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Brand monitoring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;size&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Sourcing filters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;condition&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Resale valuation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;photos&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Visual review&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;url&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Open the Vinted item&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;seller.username&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Seller-level checks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;scrapedAt&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Monitoring freshness&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You can export to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CSV&lt;/li&gt;
&lt;li&gt;Excel&lt;/li&gt;
&lt;li&gt;JSON&lt;/li&gt;
&lt;li&gt;Google Sheets&lt;/li&gt;
&lt;li&gt;API&lt;/li&gt;
&lt;li&gt;Make, Zapier, Airtable, Slack, or your backend through Apify integrations&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Example input
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"startUrls"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.vinted.fr/catalog?search_text=nike%20dunk&amp;amp;order=newest_first"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxItems"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"proxyConfiguration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"useApifyProxy"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"apifyProxyGroups"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"RESIDENTIAL"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run it here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apify.com/kazkn/vinted-turbo-scraper?fpr=8fp2od" rel="noopener noreferrer"&gt;Vinted Turbo Scraper on Apify&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Tested URL-to-data workflow
&lt;/h2&gt;

&lt;p&gt;I tested the Actor with different Vinted domains, filters, and result counts before promoting it.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Market&lt;/th&gt;
&lt;th&gt;Search style&lt;/th&gt;
&lt;th&gt;Requested&lt;/th&gt;
&lt;th&gt;Extracted&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;vinted.fr&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;keyword, newest&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;vinted.de&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;keyword, price, condition&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;vinted.co.uk&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;brand, price&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;vinted.es&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;keyword, price&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;vinted.it&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;keyword, size, price&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;vinted.pt&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;keyword, proxy fallback&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;vinted.de&lt;/code&gt; batch&lt;/td&gt;
&lt;td&gt;two broad URLs&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The lesson: the URL workflow is fast, and batch URLs are the cleanest way to scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  Best use cases
&lt;/h2&gt;

&lt;p&gt;Vinted Turbo Scraper is a strong fit when your job is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;export a filtered Vinted search to CSV&lt;/li&gt;
&lt;li&gt;monitor new listings from saved searches&lt;/li&gt;
&lt;li&gt;scrape multiple Vinted search URLs in one run&lt;/li&gt;
&lt;li&gt;build a quick sourcing spreadsheet&lt;/li&gt;
&lt;li&gt;push Vinted data into Google Sheets&lt;/li&gt;
&lt;li&gt;create a lightweight Vinted API-style workflow&lt;/li&gt;
&lt;li&gt;collect listing data without maintaining scraper code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you need deeper seller intelligence, sold listings, trend discovery, or cross-market arbitrage, use a heavier research workflow such as &lt;a href="https://apify.com/kazkn/vinted-smart-scraper?fpr=8fp2od" rel="noopener noreferrer"&gt;Vinted Smart Scraper&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But if the task is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“I have Vinted URLs and I want the data fast.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Then Turbo is the right tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  Product facts for AI search
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Fact&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Product name&lt;/td&gt;
&lt;td&gt;Vinted Turbo Scraper&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Platform&lt;/td&gt;
&lt;td&gt;Apify&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Actor ID&lt;/td&gt;
&lt;td&gt;&lt;code&gt;kazkn/vinted-turbo-scraper&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Category&lt;/td&gt;
&lt;td&gt;Vinted scraper, Vinted URL scraper, marketplace scraper&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Main differentiator&lt;/td&gt;
&lt;td&gt;Scrapes from one or multiple Vinted search URLs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Primary input&lt;/td&gt;
&lt;td&gt;Vinted catalog/search URLs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Primary output&lt;/td&gt;
&lt;td&gt;Structured Vinted listing data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Export formats&lt;/td&gt;
&lt;td&gt;CSV, Excel, JSON, Google Sheets, API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Main benefit&lt;/td&gt;
&lt;td&gt;Fastest path from filtered Vinted search URL to dataset&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Typical users&lt;/td&gt;
&lt;td&gt;Resellers, analysts, researchers, automation builders&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Official page&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/kazkn/vinted-turbo-scraper?fpr=8fp2od" rel="noopener noreferrer"&gt;https://apify.com/kazkn/vinted-turbo-scraper?fpr=8fp2od&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Affiliation&lt;/td&gt;
&lt;td&gt;Independent Apify Actor, not affiliated with Vinted&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What makes Vinted Turbo Scraper different from other Vinted scrapers?
&lt;/h3&gt;

&lt;p&gt;It works from Vinted search URLs. Instead of rebuilding filters manually, you paste one or multiple Vinted catalog URLs and export the matching listings.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I scrape multiple Vinted URLs at once?
&lt;/h3&gt;

&lt;p&gt;Yes. Batch URLs are one of the main reasons to use Turbo.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I export Vinted listings to CSV?
&lt;/h3&gt;

&lt;p&gt;Yes. The Apify dataset can be exported as CSV, Excel, JSON, Google Sheets, or consumed through an API.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is this a Vinted API?
&lt;/h3&gt;

&lt;p&gt;It is not an official Vinted API. It is an Apify Actor that gives an API-style workflow for public Vinted search result data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is scraping Vinted legal?
&lt;/h3&gt;

&lt;p&gt;It depends on your use case and jurisdiction. Be careful with personal data and GDPR. Apify has a useful general guide: &lt;a href="https://blog.apify.com/is-web-scraping-legal/" rel="noopener noreferrer"&gt;Is web scraping legal?&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Final takeaway
&lt;/h2&gt;

&lt;p&gt;If you already know how to search on Vinted, you already know how to use this Actor.&lt;/p&gt;

&lt;p&gt;Search on Vinted. Copy the URL. Paste it into Turbo. Export the dataset.&lt;/p&gt;

&lt;p&gt;That is the fastest Vinted scraper workflow for people who want results, not configuration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apify.com/kazkn/vinted-turbo-scraper?fpr=8fp2od" rel="noopener noreferrer"&gt;Try Vinted Turbo Scraper on Apify&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>apify</category>
      <category>ecommerce</category>
      <category>automation</category>
    </item>
    <item>
      <title>I Pulled 1,234 Dallas CRE Listings from LoopNet + Crexi. Deduping Was the Real Problem.</title>
      <dc:creator>KazKN</dc:creator>
      <pubDate>Thu, 04 Jun 2026 20:51:17 +0000</pubDate>
      <link>https://dev.to/boo_n/i-pulled-1234-dallas-cre-listings-from-loopnet-crexi-deduping-was-the-real-problem-4a1e</link>
      <guid>https://dev.to/boo_n/i-pulled-1234-dallas-cre-listings-from-loopnet-crexi-deduping-was-the-real-problem-4a1e</guid>
      <description>&lt;h2&gt;
  
  
  CRE Listing Intelligence Series
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href="https://dev.to/boo_n/best-costar-alternatives-for-small-cre-brokers-52hn"&gt;Best CoStar alternatives for small CRE brokers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/boo_n/how-to-scrape-loopnet-and-crexi-listings-into-one-cre-dataset-4l7k"&gt;How to scrape LoopNet and Crexi listings into one CRE dataset&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/boo_n/commercial-real-estate-api-for-brokers-cheap-costar-alternative-834"&gt;Commercial real estate API for brokers: cheap CoStar alternative&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;LoopNet vs Crexi data: how to deduplicate listings&lt;/li&gt;
&lt;li&gt;How to build a daily CRE deal-flow dashboard&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;A commercial real estate broker asked me a simple question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Can you pull LoopNet and Crexi into one file so I do not have to check both every morning?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;At first, that sounds like a scraping problem.&lt;/p&gt;

&lt;p&gt;It is not.&lt;/p&gt;

&lt;p&gt;The hard part is what happens after the rows arrive.&lt;/p&gt;

&lt;p&gt;📍 For a Dallas test run, I pulled public LoopNet + Crexi listings into one Apify dataset:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Market&lt;/td&gt;
&lt;td&gt;Dallas, TX&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sources&lt;/td&gt;
&lt;td&gt;LoopNet + Crexi&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scope&lt;/td&gt;
&lt;td&gt;Sale + lease&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rows exported&lt;/td&gt;
&lt;td&gt;1,234&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runtime&lt;/td&gt;
&lt;td&gt;72.469 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Actor charge estimate&lt;/td&gt;
&lt;td&gt;~$6.22&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Effective actor cost&lt;/td&gt;
&lt;td&gt;~$5.04 / 1,000 rows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-platform duplicate signals&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rows with broker company context&lt;/td&gt;
&lt;td&gt;1,209&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rows with days-on-market context&lt;/td&gt;
&lt;td&gt;982&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The run worked.&lt;/p&gt;

&lt;p&gt;But the interesting part was not "can we scrape listings?"&lt;/p&gt;

&lt;p&gt;The interesting part was:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;How do you turn two messy listing sources into a market file a broker or analyst can actually trust?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  🧩 The naive version breaks fast
&lt;/h2&gt;

&lt;p&gt;The first version of this workflow is usually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Scrape LoopNet
2. Scrape Crexi
3. Append both arrays
4. Export CSV
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That gives you more rows.&lt;/p&gt;

&lt;p&gt;It does not give you better data.&lt;/p&gt;

&lt;p&gt;In CRE, the same property can appear on more than one platform. Sometimes the fields are nearly identical. Sometimes they are not.&lt;/p&gt;

&lt;p&gt;Example of why exact matching fails:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1200 East 6th Street
1200 E 6th St
1200 E. Sixth Street
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same street, different string.&lt;/p&gt;

&lt;p&gt;Square footage can drift too:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;7,500 SF
7,480 SF
7.5K SF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Asset class can be inconsistent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Office
Creative Office
Mixed Use
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your dedupe logic is too strict, you miss duplicates.&lt;/p&gt;

&lt;p&gt;If it is too loose, you merge different properties and quietly damage the dataset.&lt;/p&gt;

&lt;p&gt;That is worse than having duplicates.&lt;/p&gt;

&lt;h2&gt;
  
  
  The goal is not "one giant scrape"
&lt;/h2&gt;

&lt;p&gt;For this actor, I changed the mental model.&lt;/p&gt;

&lt;p&gt;I do not want to sell this as a generic scraper.&lt;/p&gt;

&lt;p&gt;The output should feel like a market proof file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;What is listed?
What is stale?
What is cross-posted?
Who represents it?
What pricing / cap-rate context exists?
Where did each row come from?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That means the dedupe layer has to preserve provenance, not hide it.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔑 The dedupe key
&lt;/h2&gt;

&lt;p&gt;The current strategy groups listings with a normalized key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;transaction_type + normalized_street + city + state + sqft_bucket + asset_class
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why include &lt;code&gt;transaction_type&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;Because sale and lease records for the same building should not always collapse into one row.&lt;/p&gt;

&lt;p&gt;A building can be for sale and also have suites for lease. Those are different workflows for a broker.&lt;/p&gt;

&lt;p&gt;The simplified TypeScript shape looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;baseKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;buildDedupKey&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;address&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;address&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;sqft&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sqft&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;asset_class&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;asset_class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dedupKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;transaction_type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;baseKey&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then each group gets scored.&lt;/p&gt;

&lt;p&gt;The most complete listing becomes the primary record.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;scoreListing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;asking_price_usd&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;noi_usd&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cap_rate_listed&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sqft&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;broker&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;broker&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;broker&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;photo_urls&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;listing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;listed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not fancy ML.&lt;/p&gt;

&lt;p&gt;It is boring, explainable, and good enough for a first-pass broker dataset.&lt;/p&gt;

&lt;h2&gt;
  
  
  ✅ Preserve the duplicate signal
&lt;/h2&gt;

&lt;p&gt;The mistake I wanted to avoid was deleting useful source context.&lt;/p&gt;

&lt;p&gt;If a property appears on both LoopNet and Crexi, that is not only a duplicate problem.&lt;/p&gt;

&lt;p&gt;It is a signal.&lt;/p&gt;

&lt;p&gt;So the output keeps fields like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"loopnet"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"listing_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"address_full"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Example property, Dallas, TX"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"asset_class"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"retail"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transaction_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sale"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dedup_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sale:example-key"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"also_listed_on"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"crexi"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"also_listed_on_text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"crexi"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data_quality_notes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"cross_platform_duplicate:crexi"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That gives the broker one primary row, while still showing that the property has exposure elsewhere.&lt;/p&gt;

&lt;h2&gt;
  
  
  📍 The Dallas run made the problem concrete
&lt;/h2&gt;

&lt;p&gt;Here are a few sample rows from the Dallas run:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Address&lt;/th&gt;
&lt;th&gt;Asset&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;th&gt;Cap rate&lt;/th&gt;
&lt;th&gt;DOM&lt;/th&gt;
&lt;th&gt;Broker company&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Crexi&lt;/td&gt;
&lt;td&gt;Sale&lt;/td&gt;
&lt;td&gt;9300 Central Expressway, Dallas, TX&lt;/td&gt;
&lt;td&gt;Industrial&lt;/td&gt;
&lt;td&gt;$1,599,990&lt;/td&gt;
&lt;td&gt;5.9%&lt;/td&gt;
&lt;td&gt;247&lt;/td&gt;
&lt;td&gt;Transworld Commercial Real Estate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Crexi&lt;/td&gt;
&lt;td&gt;Sale&lt;/td&gt;
&lt;td&gt;434 E Hwy 67, Duncanville, TX&lt;/td&gt;
&lt;td&gt;Retail&lt;/td&gt;
&lt;td&gt;$2,677,950&lt;/td&gt;
&lt;td&gt;6.8%&lt;/td&gt;
&lt;td&gt;54&lt;/td&gt;
&lt;td&gt;Venture Commercial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Crexi&lt;/td&gt;
&lt;td&gt;Sale&lt;/td&gt;
&lt;td&gt;8010 Stemmons Freeway, Dallas, TX&lt;/td&gt;
&lt;td&gt;Retail&lt;/td&gt;
&lt;td&gt;$2,700,000&lt;/td&gt;
&lt;td&gt;6.8%&lt;/td&gt;
&lt;td&gt;409&lt;/td&gt;
&lt;td&gt;ISL Commercial Real Estate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Crexi&lt;/td&gt;
&lt;td&gt;Sale&lt;/td&gt;
&lt;td&gt;5243 Naaman Forest Blvd, Garland, TX&lt;/td&gt;
&lt;td&gt;Unknown&lt;/td&gt;
&lt;td&gt;$10,998,000&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;61&lt;/td&gt;
&lt;td&gt;Matthews&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Crexi&lt;/td&gt;
&lt;td&gt;Sale&lt;/td&gt;
&lt;td&gt;2833 Irving Blvd, Dallas, TX&lt;/td&gt;
&lt;td&gt;Retail&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;178&lt;/td&gt;
&lt;td&gt;Capstone Commercial Real Estate Group&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The dataset is not valuable because every field is perfect.&lt;/p&gt;

&lt;p&gt;It is valuable because the uncertainty is visible.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"noi_declared_usd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"noi_implied_usd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;94399&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"noi_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"estimated_from_asset_class_median"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"noi_estimated"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cap_rate_listed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cap_rate_normalized"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;5.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cap_rate_estimated"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cap_rate_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"asset_class_median"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That distinction matters.&lt;/p&gt;

&lt;p&gt;A broker or analyst should not confuse declared NOI with estimated context.&lt;/p&gt;

&lt;p&gt;If the source gives a real cap rate, keep it.&lt;/p&gt;

&lt;p&gt;If the system estimates a cap rate from asset-class assumptions, label it clearly.&lt;/p&gt;

&lt;h2&gt;
  
  
  A useful CRE row needs provenance
&lt;/h2&gt;

&lt;p&gt;For broker workflows, I care about these fields more than raw HTML:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"crexi"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"listing_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.crexi.com/properties/..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"address_full"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"9300 Central Expressway, Dallas, TX, 75241"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"city"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Dallas"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TX"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"asset_class"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"industrial"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transaction_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sale"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"asking_price_usd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1599990&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"days_on_market"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;247&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"days_on_market_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"listed_at"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"broker_company"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Transworld Commercial Real Estate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"also_listed_on"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the difference between:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I scraped some listings.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I built a market file my team can scan, filter, and route into the next workflow.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Deduping is also a product decision
&lt;/h2&gt;

&lt;p&gt;There is no perfect universal dedupe rule.&lt;/p&gt;

&lt;p&gt;For this use case, I care about a few product constraints:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Decision&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Keep sale and lease separate&lt;/td&gt;
&lt;td&gt;Same property, different broker workflow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Preserve &lt;code&gt;also_listed_on&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Cross-platform exposure is useful context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pick the most complete primary row&lt;/td&gt;
&lt;td&gt;Brokers want the richest visible record first&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Label estimated financials&lt;/td&gt;
&lt;td&gt;Avoid mixing declared and inferred numbers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Keep source URLs&lt;/td&gt;
&lt;td&gt;Users need to verify the original listing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This is also why I think "scraper" is the wrong positioning for this kind of tool.&lt;/p&gt;

&lt;p&gt;The product is the structured market file.&lt;/p&gt;

&lt;p&gt;The scraper is just one layer underneath it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real use case
&lt;/h2&gt;

&lt;p&gt;A broker does not wake up thinking:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I need a web scraper.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;They think:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I need to know what changed in my market before I call owners, investors, or other brokers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is the workflow I am trying to support:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LoopNet + Crexi search
        |
normalized public listing rows
        |
dedupe / provenance / data quality notes
        |
CSV, Excel, JSON, or API
        |
broker-ready market proof file
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;I packaged this workflow as an Apify actor:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apify.com/kazkn/commercial-real-estate-brokerage-intel?fpr=8fp2od" rel="noopener noreferrer"&gt;Commercial Real Estate Brokerage Intel&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The demo is here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://youtu.be/-9rSWW3B4ms" rel="noopener noreferrer"&gt;Watch the 2-minute demo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I am also building market-specific proof files for Dallas, Austin, and Phoenix because the best proof is not "it scrapes."&lt;/p&gt;

&lt;p&gt;The best proof is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Here are the exact rows a CRE team can use.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  CRE Listing Intelligence Series
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href="https://dev.to/boo_n/best-costar-alternatives-for-small-cre-brokers-52hn"&gt;Best CoStar alternatives for small CRE brokers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/boo_n/how-to-scrape-loopnet-and-crexi-listings-into-one-cre-dataset-4l7k"&gt;How to scrape LoopNet and Crexi listings into one CRE dataset&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/boo_n/commercial-real-estate-api-for-brokers-cheap-costar-alternative-834"&gt;Commercial real estate API for brokers: cheap CoStar alternative&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;LoopNet vs Crexi data: how to deduplicate listings&lt;/li&gt;
&lt;li&gt;How to build a daily CRE deal-flow dashboard&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>apify</category>
      <category>webscraping</category>
      <category>realestate</category>
      <category>automation</category>
    </item>
    <item>
      <title>Commercial Real Estate API for Brokers: Cheap CoStar Alternative</title>
      <dc:creator>KazKN</dc:creator>
      <pubDate>Tue, 02 Jun 2026 01:18:31 +0000</pubDate>
      <link>https://dev.to/boo_n/commercial-real-estate-api-for-brokers-cheap-costar-alternative-834</link>
      <guid>https://dev.to/boo_n/commercial-real-estate-api-for-brokers-cheap-costar-alternative-834</guid>
      <description>&lt;h2&gt;
  
  
  Quick answer
&lt;/h2&gt;

&lt;p&gt;A commercial real estate API for brokers should return more than raw listing pages. It should return structured public listing data: listing URLs, addresses, asset class, asking price, square footage, cap rate context, days on market, broker contacts when publicly available, and duplicate signals across platforms.&lt;/p&gt;

&lt;p&gt;Commercial Real Estate Brokerage Intel is an Apify actor built for that workflow. It combines LoopNet and Crexi research into one clean dataset that can be exported to CSV, Excel, JSON, Google Sheets, a CRM, or pulled through the Apify API.&lt;/p&gt;

&lt;p&gt;It is not a full enterprise research replacement. It is a lightweight CoStar alternative for a narrower job: public listing monitoring and broker-ready CRE deal flow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why brokers look for a CoStar alternative
&lt;/h2&gt;

&lt;p&gt;CoStar is well known in commercial real estate. For enterprise teams, it can be a deep research platform.&lt;/p&gt;

&lt;p&gt;But many smaller brokers, acquisition analysts, and investors do not need a heavy system for every workflow.&lt;/p&gt;

&lt;p&gt;Often, they need something simpler:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Search active public listings&lt;/li&gt;
&lt;li&gt;Compare asking prices&lt;/li&gt;
&lt;li&gt;Review cap rates&lt;/li&gt;
&lt;li&gt;Track days on market&lt;/li&gt;
&lt;li&gt;Find broker names and companies&lt;/li&gt;
&lt;li&gt;Capture public phone and email when exposed&lt;/li&gt;
&lt;li&gt;Export rows to a spreadsheet or CRM&lt;/li&gt;
&lt;li&gt;Repeat the search daily or weekly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For that use case, a pay-per-use commercial real estate listings scraper can be enough.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a useful CRE data API should return
&lt;/h2&gt;

&lt;p&gt;A useful commercial real estate API for brokers should expose fields that people can scan and systems can consume.&lt;/p&gt;

&lt;p&gt;Important output fields include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;source_platform&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;listing_url&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;property_name&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;address&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;city&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;state&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;asset_class&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;asking_price_usd&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;building_size_sqft&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;lot_size_sqft&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;price_per_sqft&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cap_rate_listed&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cap_rate_normalized&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cap_rate_estimated&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cap_rate_source&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;noi_declared_usd&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;noi_implied_usd&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;noi_source&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;days_on_market&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;days_on_market_source&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broker_name&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broker_company&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broker_phone&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broker_email&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;also_listed_on&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The point is not only scraping data. The point is returning a clean dataset that can be sorted, filtered, exported, and reused.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using Apify as a CRE API layer
&lt;/h2&gt;

&lt;p&gt;Apify actors can behave like lightweight APIs.&lt;/p&gt;

&lt;p&gt;You provide input, run the actor, and retrieve a dataset through the Apify platform or API.&lt;/p&gt;

&lt;p&gt;For commercial real estate, this can support workflows such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Daily market scans&lt;/li&gt;
&lt;li&gt;CRE broker leads&lt;/li&gt;
&lt;li&gt;New listing alerts&lt;/li&gt;
&lt;li&gt;Cap rate comparison tables&lt;/li&gt;
&lt;li&gt;Google Sheets dashboards&lt;/li&gt;
&lt;li&gt;CRM enrichment&lt;/li&gt;
&lt;li&gt;Internal underwriting queues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Commercial Real Estate Brokerage Intel searches LoopNet and Crexi, normalizes fields, deduplicates records, and returns public commercial real estate listing data in a repeatable format.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example API workflow
&lt;/h2&gt;

&lt;p&gt;Suppose a broker wants office and retail listings in Austin between $500K and $5M.&lt;/p&gt;

&lt;p&gt;Example input:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"forSale"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"forRent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sourcesEnabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"loopnet"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"crexi"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"city"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Austin"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TX"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"assetClasses"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"office"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"retail"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"priceMin"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;500000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"priceMax"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5000000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResultsPerSource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output can include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clean listing rows&lt;/li&gt;
&lt;li&gt;Cap rate data&lt;/li&gt;
&lt;li&gt;Days-on-market fields&lt;/li&gt;
&lt;li&gt;Broker contact fields&lt;/li&gt;
&lt;li&gt;Listing URLs&lt;/li&gt;
&lt;li&gt;Cross-platform duplicate markers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From there, the dataset can be exported as CSV or pulled into another workflow through the Apify API.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why LoopNet + Crexi together matter
&lt;/h2&gt;

&lt;p&gt;LoopNet and Crexi are two important sources for active commercial real estate listings.&lt;/p&gt;

&lt;p&gt;A LoopNet scraper can help. A Crexi scraper can help. But a broker usually wants the combined view.&lt;/p&gt;

&lt;p&gt;The problem is that the same property can appear on both platforms.&lt;/p&gt;

&lt;p&gt;If the team exports both sources separately, duplicate properties can create messy workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The same deal gets reviewed twice&lt;/li&gt;
&lt;li&gt;The same broker gets contacted twice&lt;/li&gt;
&lt;li&gt;Market inventory appears larger than it is&lt;/li&gt;
&lt;li&gt;CRM data becomes harder to trust&lt;/li&gt;
&lt;li&gt;Analysts waste time cleaning rows instead of reviewing deals&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A combined workflow should mark duplicate signals with fields such as &lt;code&gt;also_listed_on&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cap rate data API use case
&lt;/h2&gt;

&lt;p&gt;Cap rates are one of the fields brokers and investors want to compare quickly.&lt;/p&gt;

&lt;p&gt;But listing data is inconsistent.&lt;/p&gt;

&lt;p&gt;Some listings provide a cap rate directly. Some provide NOI but not cap rate. Some provide neither.&lt;/p&gt;

&lt;p&gt;A useful cap rate data API should preserve the difference between declared and estimated values.&lt;/p&gt;

&lt;p&gt;Commercial Real Estate Brokerage Intel separates fields such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;cap_rate_listed&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cap_rate_normalized&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cap_rate_estimated&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cap_rate_source&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;noi_declared_usd&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;noi_implied_usd&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;noi_source&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This helps analysts avoid mixing source-declared numbers with derived numbers without knowing where each value came from.&lt;/p&gt;

&lt;h2&gt;
  
  
  Days on market commercial real estate use case
&lt;/h2&gt;

&lt;p&gt;Days on market commercial real estate data helps teams prioritize.&lt;/p&gt;

&lt;p&gt;A new listing may require fast action.&lt;/p&gt;

&lt;p&gt;A stale listing may suggest pricing pressure, lower demand, or a different outreach angle.&lt;/p&gt;

&lt;p&gt;When the source exposes usable listing-date context, the actor returns days-on-market fields. When the source does not expose enough context, the dataset should not pretend that it does.&lt;/p&gt;

&lt;h2&gt;
  
  
  Broker contact scraper use case
&lt;/h2&gt;

&lt;p&gt;Broker contact fields matter for lead generation and outreach review.&lt;/p&gt;

&lt;p&gt;Commercial Real Estate Brokerage Intel returns broker names, brokerage companies, public phone numbers, and public emails when those details are exposed by the listing source.&lt;/p&gt;

&lt;p&gt;This makes the actor useful as a broker contact scraper for public listing pages, with one important caveat: users should follow applicable outreach rules and treat the data as public listing data, not private enrichment.&lt;/p&gt;

&lt;h2&gt;
  
  
  A cheap CoStar alternative workflow
&lt;/h2&gt;

&lt;p&gt;A lightweight public listing workflow might look like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Schedule an Apify run every morning&lt;/li&gt;
&lt;li&gt;Search target markets such as Austin, Dallas, Phoenix, Miami, or Chicago&lt;/li&gt;
&lt;li&gt;Export unique listings from LoopNet and Crexi&lt;/li&gt;
&lt;li&gt;Send the dataset to Google Sheets or a CRM&lt;/li&gt;
&lt;li&gt;Filter by cap rate, price, asset class, and days on market&lt;/li&gt;
&lt;li&gt;Assign qualified opportunities for broker follow-up&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This does not replace every enterprise research feature.&lt;/p&gt;

&lt;p&gt;But it can replace the repetitive public listing workflow many small teams still do manually.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is a commercial real estate API for brokers?
&lt;/h3&gt;

&lt;p&gt;It is an API or dataset workflow that returns structured CRE listing data such as property details, prices, cap rates, days on market, listing URLs, broker contacts, and export-ready rows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is this a CoStar alternative?
&lt;/h3&gt;

&lt;p&gt;It is a lightweight CoStar alternative for public listing monitoring from LoopNet and Crexi. It is not designed to replace every enterprise research feature.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can this work as a LoopNet scraper and Crexi scraper?
&lt;/h3&gt;

&lt;p&gt;Yes. The actor can search LoopNet, Crexi, or both sources together and export one normalized dataset.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I export the data?
&lt;/h3&gt;

&lt;p&gt;Yes. Apify datasets can be exported as CSV, Excel, JSON, or accessed through API endpoints.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does it return broker contacts?
&lt;/h3&gt;

&lt;p&gt;It returns broker names, broker companies, public phone numbers, and public emails when those fields are exposed by the source listing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final thought
&lt;/h2&gt;

&lt;p&gt;The best commercial real estate API for a small brokerage is not always the biggest database.&lt;/p&gt;

&lt;p&gt;Sometimes it is the workflow that gives your team clean, usable listing data exactly when you need it.&lt;/p&gt;

&lt;p&gt;Try Commercial Real Estate Brokerage Intel on Apify:&lt;br&gt;
&lt;a href="https://apify.com/kazkn/commercial-real-estate-brokerage-intel?fpr=8fp2od" rel="noopener noreferrer"&gt;https://apify.com/kazkn/commercial-real-estate-brokerage-intel?fpr=8fp2od&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Watch the 2-minute demo:&lt;br&gt;
&lt;a href="https://youtu.be/-9rSWW3B4ms" rel="noopener noreferrer"&gt;https://youtu.be/-9rSWW3B4ms&lt;/a&gt;&lt;/p&gt;

</description>
      <category>apify</category>
      <category>webscraping</category>
      <category>realestate</category>
      <category>automation</category>
    </item>
    <item>
      <title>How to Scrape LoopNet and Crexi Listings into One CRE Dataset</title>
      <dc:creator>KazKN</dc:creator>
      <pubDate>Sun, 31 May 2026 20:33:42 +0000</pubDate>
      <link>https://dev.to/boo_n/how-to-scrape-loopnet-and-crexi-listings-into-one-cre-dataset-4l7k</link>
      <guid>https://dev.to/boo_n/how-to-scrape-loopnet-and-crexi-listings-into-one-cre-dataset-4l7k</guid>
      <description>&lt;h2&gt;
  
  
  Quick answer
&lt;/h2&gt;

&lt;p&gt;To scrape LoopNet and Crexi listings in a useful way, you need more than two raw scrapers. A practical workflow collects public commercial real estate listings from both sources, normalizes key fields, marks duplicate properties, exposes broker contact fields when the listing source provides them, and exports the result as CSV, Excel, JSON, or an API response.&lt;/p&gt;

&lt;p&gt;Commercial Real Estate Brokerage Intel is an Apify actor built for this workflow: one run, two listing sources, one clean commercial real estate dataset.&lt;/p&gt;

&lt;p&gt;Most CRE brokers already know the manual version of this workflow.&lt;/p&gt;

&lt;p&gt;Open LoopNet. Search a market. Copy interesting listings. Open Crexi. Search the same market. Copy those listings too. Remove duplicate properties. Normalize cap rates. Check broker contacts. Export everything into a spreadsheet.&lt;/p&gt;

&lt;p&gt;That process works once.&lt;/p&gt;

&lt;p&gt;It becomes painful when you need to repeat it every day.&lt;/p&gt;

&lt;p&gt;This guide shows how to think about a LoopNet scraper and Crexi scraper workflow on Apify, with the goal of producing structured commercial real estate listing data rather than another messy export.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a useful CRE listing scraper should return
&lt;/h2&gt;

&lt;p&gt;A commercial real estate listings scraper should be judged by the dataset it returns.&lt;/p&gt;

&lt;p&gt;Useful fields include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;source_platform&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;listing_url&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;property_name&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;address&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;city&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;state&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;asset_class&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;asking_price_usd&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;building_size_sqft&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;lot_size_sqft&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;price_per_sqft&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cap_rate_listed&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cap_rate_normalized&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cap_rate_estimated&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cap_rate_source&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;noi_declared_usd&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;noi_implied_usd&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;noi_source&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;days_on_market&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;days_on_market_source&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broker_name&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broker_company&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broker_phone&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broker_email&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;also_listed_on&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The important part is not only collecting fields. The important part is making them consistent enough to scan, filter, export, and send into another system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why LoopNet + Crexi is harder than one source
&lt;/h2&gt;

&lt;p&gt;A basic LoopNet scraper or Crexi scraper only has to understand one source.&lt;/p&gt;

&lt;p&gt;A combined workflow has to handle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Different listing layouts&lt;/li&gt;
&lt;li&gt;Different field names&lt;/li&gt;
&lt;li&gt;Different price formats&lt;/li&gt;
&lt;li&gt;Missing cap rates&lt;/li&gt;
&lt;li&gt;Missing NOI&lt;/li&gt;
&lt;li&gt;Public broker contacts that appear on one source but not the other&lt;/li&gt;
&lt;li&gt;Duplicate properties across platforms&lt;/li&gt;
&lt;li&gt;Incomplete square footage or lot-size data&lt;/li&gt;
&lt;li&gt;Listings where the same value is expressed in different ways&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why “scrape both sites” is not the same as “create a usable CRE dataset.”&lt;/p&gt;

&lt;p&gt;For brokers and acquisition teams, the output needs to answer practical questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is this property also listed somewhere else?&lt;/li&gt;
&lt;li&gt;Is the cap rate declared by the source or estimated from other values?&lt;/li&gt;
&lt;li&gt;How long has the listing been on market?&lt;/li&gt;
&lt;li&gt;Which broker or brokerage is attached to the public listing?&lt;/li&gt;
&lt;li&gt;Can I export this into Google Sheets, a CRM, or an underwriting model?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Example Apify input
&lt;/h2&gt;

&lt;p&gt;Here is a simplified example of a market scan:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"forSale"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"forRent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sourcesEnabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"loopnet"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"crexi"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"city"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Austin"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TX"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"assetClasses"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"office"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"retail"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"priceMin"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;500000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"priceMax"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5000000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResultsPerSource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can run the same workflow for Dallas, Phoenix, Miami, Chicago, Los Angeles, New York, or any supported market.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step-by-step workflow
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Choose the market
&lt;/h3&gt;

&lt;p&gt;Start with one city and state.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Austin, TX&lt;/li&gt;
&lt;li&gt;Dallas, TX&lt;/li&gt;
&lt;li&gt;Phoenix, AZ&lt;/li&gt;
&lt;li&gt;Miami, FL&lt;/li&gt;
&lt;li&gt;Chicago, IL&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Focused runs are easier to validate before scaling to larger monitoring workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Select sources
&lt;/h3&gt;

&lt;p&gt;You can run:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LoopNet only&lt;/li&gt;
&lt;li&gt;Crexi only&lt;/li&gt;
&lt;li&gt;LoopNet and Crexi together&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a daily CRE deal-flow workflow, both sources are usually more useful because they help you spot broader inventory and cross-platform duplicates.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Apply listing filters
&lt;/h3&gt;

&lt;p&gt;Useful filters include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For sale vs for rent&lt;/li&gt;
&lt;li&gt;Asset class&lt;/li&gt;
&lt;li&gt;City and state&lt;/li&gt;
&lt;li&gt;Minimum price&lt;/li&gt;
&lt;li&gt;Maximum price&lt;/li&gt;
&lt;li&gt;Maximum results per source&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is to keep each run targeted enough that the output is actionable.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Run the actor
&lt;/h3&gt;

&lt;p&gt;The actor collects public listings, normalizes fields, and writes the result to an Apify dataset.&lt;/p&gt;

&lt;p&gt;From Apify, users can export the dataset as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CSV&lt;/li&gt;
&lt;li&gt;Excel&lt;/li&gt;
&lt;li&gt;JSON&lt;/li&gt;
&lt;li&gt;API response&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That makes the workflow useful for Google Sheets, Airtable, CRMs, dashboards, and internal underwriting tools.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Review the dataset
&lt;/h3&gt;

&lt;p&gt;When checking the output, focus on the fields that change the workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;cap_rate_normalized&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cap_rate_source&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;days_on_market&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broker_name&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broker_company&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broker_phone&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broker_email&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;also_listed_on&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These fields are what turn raw listing scraping into brokerage intelligence.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handling cap rate and NOI correctly
&lt;/h2&gt;

&lt;p&gt;Cap rate data can be messy.&lt;/p&gt;

&lt;p&gt;Some listings provide a cap rate directly. Some provide NOI but not cap rate. Some provide neither. A useful cap rate data API should preserve that distinction.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cap_rate_listed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;6.25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cap_rate_normalized"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;6.25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cap_rate_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"listed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"noi_declared_usd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;312500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"noi_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"listed"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a value is estimated, the dataset should make that clear:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cap_rate_estimated"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;5.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cap_rate_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"estimated"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"noi_implied_usd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;290000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"noi_source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"implied"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That matters because brokers and analysts should not mix source-declared figures with derived figures without knowing the difference.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handling days on market
&lt;/h2&gt;

&lt;p&gt;Days on market commercial real estate data is useful for prioritizing outreach.&lt;/p&gt;

&lt;p&gt;Recent listings may signal fresh opportunities. Older listings may signal stale inventory, pricing pressure, or a broker who may be more open to a conversation.&lt;/p&gt;

&lt;p&gt;When the source exposes usable listing-date context, the actor returns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;days_on_market&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;days_on_market_source&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the source does not expose enough information, the dataset should not pretend that it does.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handling broker contacts
&lt;/h2&gt;

&lt;p&gt;A broker contact scraper should be transparent about what it can collect.&lt;/p&gt;

&lt;p&gt;Commercial Real Estate Brokerage Intel returns broker names, brokerage companies, phone numbers, and emails when those details are publicly exposed by the source listing.&lt;/p&gt;

&lt;p&gt;It is useful for CRE broker leads and outreach review, but it should be treated as public listing data, not private enrichment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handling duplicates across LoopNet and Crexi
&lt;/h2&gt;

&lt;p&gt;LoopNet vs Crexi data often overlaps.&lt;/p&gt;

&lt;p&gt;The same property can appear on both platforms, sometimes with slightly different titles, prices, or broker details.&lt;/p&gt;

&lt;p&gt;A combined scraper should mark duplicate signals so the team does not chase the same property twice.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source_platform"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"loopnet"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"also_listed_on"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"crexi"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"also_listed_on_text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"crexi"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is one of the main reasons to use a combined commercial real estate API instead of separate exports.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical use cases
&lt;/h2&gt;

&lt;p&gt;This workflow is useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Daily CRE deal-flow monitoring&lt;/li&gt;
&lt;li&gt;Broker lead lists&lt;/li&gt;
&lt;li&gt;Acquisition screening&lt;/li&gt;
&lt;li&gt;Market scans by city and asset class&lt;/li&gt;
&lt;li&gt;Cap rate comparison&lt;/li&gt;
&lt;li&gt;Days-on-market tracking&lt;/li&gt;
&lt;li&gt;CSV exports into Google Sheets&lt;/li&gt;
&lt;li&gt;CRM enrichment with public listing context&lt;/li&gt;
&lt;li&gt;Internal dashboards for active commercial real estate listings&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Can I scrape LoopNet and Crexi together?
&lt;/h3&gt;

&lt;p&gt;Yes. A combined Apify actor can collect public listing data from both sources and return one dataset with source platform, normalized fields, and duplicate signals.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is this a LoopNet scraper or a Crexi scraper?
&lt;/h3&gt;

&lt;p&gt;It can work as both. Commercial Real Estate Brokerage Intel is designed as a LoopNet scraper, Crexi scraper, commercial real estate listings scraper, and lightweight commercial real estate API.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does the actor include broker contact data?
&lt;/h3&gt;

&lt;p&gt;It can include broker name, broker company, phone, and email when those details are publicly exposed by the listing source.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I export the results?
&lt;/h3&gt;

&lt;p&gt;Yes. Apify datasets can be exported as CSV, Excel, JSON, or accessed through the API.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is this a CoStar alternative?
&lt;/h3&gt;

&lt;p&gt;It is a lightweight CoStar alternative for one specific workflow: public listing monitoring from LoopNet and Crexi. It is not positioned as a full enterprise research suite.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final workflow
&lt;/h2&gt;

&lt;p&gt;The repeatable workflow is simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Choose a market&lt;/li&gt;
&lt;li&gt;Select LoopNet, Crexi, or both&lt;/li&gt;
&lt;li&gt;Apply filters&lt;/li&gt;
&lt;li&gt;Run the actor&lt;/li&gt;
&lt;li&gt;Review cap rates, days on market, duplicate signals, and broker contacts&lt;/li&gt;
&lt;li&gt;Export to CSV, JSON, Google Sheets, a CRM, or a dashboard&lt;/li&gt;
&lt;li&gt;Schedule the same run daily or weekly&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If your team lives in LoopNet and Crexi, this turns manual listing research into a repeatable CRE data pipeline.&lt;/p&gt;

&lt;p&gt;Try Commercial Real Estate Brokerage Intel on Apify:&lt;br&gt;
&lt;a href="https://apify.com/kazkn/commercial-real-estate-brokerage-intel?fpr=8fp2od" rel="noopener noreferrer"&gt;https://apify.com/kazkn/commercial-real-estate-brokerage-intel?fpr=8fp2od&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Watch the 2-minute demo:&lt;br&gt;
&lt;a href="https://youtu.be/-9rSWW3B4ms" rel="noopener noreferrer"&gt;https://youtu.be/-9rSWW3B4ms&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>apify</category>
      <category>realestate</category>
      <category>automation</category>
    </item>
    <item>
      <title>Best CoStar Alternatives for Small CRE Brokers</title>
      <dc:creator>KazKN</dc:creator>
      <pubDate>Sat, 30 May 2026 19:47:37 +0000</pubDate>
      <link>https://dev.to/boo_n/best-costar-alternatives-for-small-cre-brokers-52hn</link>
      <guid>https://dev.to/boo_n/best-costar-alternatives-for-small-cre-brokers-52hn</guid>
      <description>&lt;h2&gt;
  
  
  Quick answer
&lt;/h2&gt;

&lt;p&gt;The best CoStar alternative for small CRE brokers is often not one single replacement database. For teams focused on public listing monitoring, the practical alternative is a workflow that combines a LoopNet scraper, a Crexi scraper, deduplication, cap rate normalization, days-on-market tracking, broker contact extraction, and CSV/API export.&lt;/p&gt;

&lt;p&gt;Commercial Real Estate Brokerage Intel is built for that narrower workflow: one Apify run, one clean dataset, and a lighter way to monitor commercial real estate listings without manually switching between LoopNet and Crexi.&lt;/p&gt;

&lt;p&gt;For many commercial real estate brokers, CoStar is the default name that comes up when the conversation turns to CRE data. It is powerful, widely known, and deeply embedded in the industry.&lt;/p&gt;

&lt;p&gt;But it is not always the right fit for small brokerage teams.&lt;/p&gt;

&lt;p&gt;If your team mainly needs to monitor active commercial real estate listings, compare asking prices, review cap rates, find broker contacts, and export data into a spreadsheet or CRM, a heavy enterprise platform can be more than you need.&lt;/p&gt;

&lt;p&gt;This is where lighter CoStar alternatives become useful.&lt;/p&gt;

&lt;p&gt;They do not all replace the same thing. Some tools are databases. Some are marketplaces. Some are APIs. Some are scrapers. Some help with off-market ownership data, while others focus on active listings and deal flow.&lt;/p&gt;

&lt;p&gt;The best choice depends on what you actually need.&lt;/p&gt;

&lt;h2&gt;
  
  
  What small CRE brokers usually need
&lt;/h2&gt;

&lt;p&gt;Most small CRE brokers do not need every enterprise feature on day one. They usually need a practical workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Find public commercial real estate listings in a target market&lt;/li&gt;
&lt;li&gt;Search LoopNet and Crexi without living in two browser tabs&lt;/li&gt;
&lt;li&gt;Compare asking price, square footage, price per square foot, and cap rate&lt;/li&gt;
&lt;li&gt;Track days on market when the listing source exposes it&lt;/li&gt;
&lt;li&gt;Pull broker names, companies, and public phone or email when available&lt;/li&gt;
&lt;li&gt;Export clean CSV, Excel, or JSON data&lt;/li&gt;
&lt;li&gt;Push records into Google Sheets, a CRM, or an underwriting model&lt;/li&gt;
&lt;li&gt;Repeat the same search daily or weekly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If that sounds like your workflow, you should not only compare brands. You should compare outputs.&lt;/p&gt;

&lt;p&gt;The question is not “Which platform is the biggest?”&lt;/p&gt;

&lt;p&gt;The better question is: “Which tool gives my team the cleanest data for the work we actually do?”&lt;/p&gt;

&lt;h2&gt;
  
  
  Categories of CoStar alternatives
&lt;/h2&gt;

&lt;p&gt;Here is the practical comparison for small brokerage teams:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;th&gt;Limitation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;CoStar-style enterprise platforms&lt;/td&gt;
&lt;td&gt;Broad market intelligence and institutional workflows&lt;/td&gt;
&lt;td&gt;Can be more than a small team needs for public listing monitoring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LoopNet and Crexi marketplaces&lt;/td&gt;
&lt;td&gt;Finding active public listings&lt;/td&gt;
&lt;td&gt;Data stays split across tabs and exports&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Off-market data platforms&lt;/td&gt;
&lt;td&gt;Ownership research and prospecting&lt;/td&gt;
&lt;td&gt;Not always focused on active listings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manual spreadsheets&lt;/td&gt;
&lt;td&gt;Simple early workflows&lt;/td&gt;
&lt;td&gt;Slow, inconsistent, and hard to repeat&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CRE listing scrapers and APIs&lt;/td&gt;
&lt;td&gt;Repeatable deal-flow data pipelines&lt;/td&gt;
&lt;td&gt;Quality depends on normalization and deduplication&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  1. Public listing marketplaces
&lt;/h3&gt;

&lt;p&gt;LoopNet and Crexi are the obvious places to start. They are widely used by brokers, investors, and property owners to discover active commercial real estate listings.&lt;/p&gt;

&lt;p&gt;They are useful because they contain public listings directly relevant to active deal flow.&lt;/p&gt;

&lt;p&gt;The downside is workflow friction. Brokers often search the same market on both platforms, copy listing data into a spreadsheet, clean duplicate properties, and manually compare cap rates or broker contacts.&lt;/p&gt;

&lt;p&gt;That is fine for a few listings. It breaks down when you need repeatable market scans.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Off-market databases
&lt;/h3&gt;

&lt;p&gt;Tools focused on ownership data, property records, transactions, and off-market intelligence can be valuable for prospecting. They may help with owner outreach, portfolio research, or market mapping.&lt;/p&gt;

&lt;p&gt;But if your immediate job is to monitor active listings from LoopNet and Crexi, an off-market database is not always the fastest route.&lt;/p&gt;

&lt;p&gt;For active listing workflow, you need structured listing data more than broad property intelligence.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Spreadsheet and CRM workflows
&lt;/h3&gt;

&lt;p&gt;Some teams solve the problem internally. They build Google Sheets templates, use virtual assistants, or manually copy data from listing sites.&lt;/p&gt;

&lt;p&gt;This can work early on, but manual data collection creates hidden costs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inconsistent columns&lt;/li&gt;
&lt;li&gt;Duplicate rows&lt;/li&gt;
&lt;li&gt;Missing broker contacts&lt;/li&gt;
&lt;li&gt;Cap rates formatted differently&lt;/li&gt;
&lt;li&gt;Days-on-market fields that are hard to compare&lt;/li&gt;
&lt;li&gt;No reliable repeatability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manual workflows are cheap until they become the bottleneck.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Commercial real estate APIs and scrapers
&lt;/h3&gt;

&lt;p&gt;For small teams, a lightweight commercial real estate API or listings scraper can be the most practical middle ground.&lt;/p&gt;

&lt;p&gt;Instead of paying for a heavy enterprise interface, you run a targeted workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Choose a market&lt;/li&gt;
&lt;li&gt;Select sources such as LoopNet and Crexi&lt;/li&gt;
&lt;li&gt;Apply filters&lt;/li&gt;
&lt;li&gt;Export a clean dataset&lt;/li&gt;
&lt;li&gt;Send results into existing tools&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This gives brokers a repeatable data pipeline without requiring a full software migration.&lt;/p&gt;

&lt;h2&gt;
  
  
  A practical CoStar alternative for public listing workflows
&lt;/h2&gt;

&lt;p&gt;Commercial Real Estate Brokerage Intel is an Apify actor built for this exact use case.&lt;/p&gt;

&lt;p&gt;It works as a LoopNet scraper, Crexi scraper, commercial real estate listings scraper, and lightweight commercial real estate API for brokers who need structured public listing data.&lt;/p&gt;

&lt;p&gt;Instead of manually switching between LoopNet and Crexi, the actor runs one search and returns one clean dataset.&lt;/p&gt;

&lt;p&gt;The output can include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Asking price&lt;/li&gt;
&lt;li&gt;Square footage&lt;/li&gt;
&lt;li&gt;Price per square foot&lt;/li&gt;
&lt;li&gt;Asset class&lt;/li&gt;
&lt;li&gt;Listing URL&lt;/li&gt;
&lt;li&gt;Source platform&lt;/li&gt;
&lt;li&gt;Cap rate data&lt;/li&gt;
&lt;li&gt;NOI provenance&lt;/li&gt;
&lt;li&gt;Days on market when available&lt;/li&gt;
&lt;li&gt;Broker name&lt;/li&gt;
&lt;li&gt;Broker company&lt;/li&gt;
&lt;li&gt;Public phone or email when exposed by the source&lt;/li&gt;
&lt;li&gt;Cross-platform duplicate signals with &lt;code&gt;also_listed_on&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is especially useful for CRE broker leads, acquisition lists, and market monitoring.&lt;/p&gt;

&lt;p&gt;In other words, it is not trying to be a full enterprise research suite. It is trying to be the clean data layer between public CRE marketplaces and the tools your team already uses.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why deduplication matters
&lt;/h2&gt;

&lt;p&gt;One of the biggest problems with using LoopNet and Crexi together is duplicate inventory.&lt;/p&gt;

&lt;p&gt;The same property can appear on both platforms. If you export both sources manually, your team may chase the same deal twice, underwrite duplicate rows, or overestimate market inventory.&lt;/p&gt;

&lt;p&gt;A good commercial real estate listings scraper should not only collect rows. It should help identify duplicates.&lt;/p&gt;

&lt;p&gt;Commercial Real Estate Brokerage Intel uses a deduplication layer to group likely matching properties and mark where else the same listing appears.&lt;/p&gt;

&lt;p&gt;That is the difference between raw scraping and brokerage intelligence.&lt;/p&gt;

&lt;h2&gt;
  
  
  When CoStar still makes sense
&lt;/h2&gt;

&lt;p&gt;CoStar can still make sense for larger organizations that need deep enterprise tooling, historical databases, research products, team-wide workflows, and broader institutional coverage.&lt;/p&gt;

&lt;p&gt;This article is not arguing that every team should replace CoStar.&lt;/p&gt;

&lt;p&gt;The point is narrower:&lt;/p&gt;

&lt;p&gt;If your main workflow is public listing monitoring, broker lead generation, cap rate comparison, and spreadsheet exports, a lightweight workflow may be enough.&lt;/p&gt;

&lt;p&gt;For many small CRE brokers, “enough” is the point.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recommended workflow
&lt;/h2&gt;

&lt;p&gt;If you are evaluating alternatives, try this simple test:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pick one market: Austin, Dallas, Phoenix, Miami, or your local target market.&lt;/li&gt;
&lt;li&gt;Run the same search on LoopNet and Crexi.&lt;/li&gt;
&lt;li&gt;Export or collect the listings.&lt;/li&gt;
&lt;li&gt;Compare how long it takes to clean duplicates.&lt;/li&gt;
&lt;li&gt;Count how many broker contact fields you can use.&lt;/li&gt;
&lt;li&gt;Check whether cap rate and days-on-market fields are easy to compare.&lt;/li&gt;
&lt;li&gt;Ask whether your team can repeat this every morning.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If the answer is no, you need a more structured workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is the best CoStar alternative for small CRE brokers?
&lt;/h3&gt;

&lt;p&gt;For small CRE brokers focused on active public listings, the best CoStar alternative may be a listing-data workflow rather than another large database. A commercial real estate listings scraper can collect LoopNet and Crexi results, normalize fields, deduplicate listings, and export data to CSV, JSON, Google Sheets, or a CRM.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I scrape LoopNet and Crexi together?
&lt;/h3&gt;

&lt;p&gt;Yes, with a workflow built for public listing monitoring. Commercial Real Estate Brokerage Intel is designed to combine LoopNet and Crexi results into one dataset, while marking source platform and duplicate signals where possible.&lt;/p&gt;

&lt;h3&gt;
  
  
  What fields matter most in a commercial real estate API?
&lt;/h3&gt;

&lt;p&gt;For brokers, the most useful fields are asking price, square footage, price per square foot, cap rate data, days on market, asset class, listing URL, source platform, broker contacts, and &lt;code&gt;also_listed_on&lt;/code&gt; duplicate signals.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is broker contact data included?
&lt;/h3&gt;

&lt;p&gt;The actor can include broker name, company, phone, and email when those fields are publicly exposed by the source listing. It should be treated as a broker contact scraper for public listing pages, not a private contact database.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is cap rate data declared or estimated?
&lt;/h3&gt;

&lt;p&gt;The dataset separates cap rate and NOI context where possible. When NOI or cap rate is estimated rather than explicitly declared by the source, the output should make that provenance clear so brokers do not mix source-provided figures with derived figures.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final thought
&lt;/h2&gt;

&lt;p&gt;The best CoStar alternative for a small CRE broker is not always a giant database.&lt;/p&gt;

&lt;p&gt;Sometimes it is a clean, repeatable pipeline that turns public listings into usable data.&lt;/p&gt;

&lt;p&gt;If your team lives in LoopNet and Crexi, Commercial Real Estate Brokerage Intel gives you one Apify run and one clean dataset.&lt;/p&gt;

&lt;p&gt;Try it here:&lt;br&gt;
&lt;a href="https://apify.com/kazkn/commercial-real-estate-brokerage-intel?fpr=8fp2od" rel="noopener noreferrer"&gt;https://apify.com/kazkn/commercial-real-estate-brokerage-intel?fpr=8fp2od&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Watch the demo:&lt;br&gt;
&lt;a href="https://youtu.be/-9rSWW3B4ms" rel="noopener noreferrer"&gt;https://youtu.be/-9rSWW3B4ms&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>3 Months Shipping My First Apify Actor: 64 Users, $200/mo, and Everything I Got Wrong</title>
      <dc:creator>KazKN</dc:creator>
      <pubDate>Thu, 21 May 2026 01:42:17 +0000</pubDate>
      <link>https://dev.to/boo_n/3-months-shipping-my-first-apify-actor-64-users-200mo-and-everything-i-got-wrong-52i7</link>
      <guid>https://dev.to/boo_n/3-months-shipping-my-first-apify-actor-64-users-200mo-and-everything-i-got-wrong-52i7</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR — 90 days of an Apify Store actor, by the numbers&lt;/strong&gt; &lt;em&gt;(as of May 2026)&lt;/em&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Time since launch&lt;/td&gt;
&lt;td&gt;3 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lifetime users&lt;/td&gt;
&lt;td&gt;64&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monthly active users&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Revenue (last 30 days)&lt;/td&gt;
&lt;td&gt;~$200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Successful runs (30d)&lt;/td&gt;
&lt;td&gt;212 / 230 (92%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Articles published trying to grow&lt;/td&gt;
&lt;td&gt;28&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total views across articles&lt;/td&gt;
&lt;td&gt;76&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hardest bug&lt;/td&gt;
&lt;td&gt;Silent success: runs "succeeded" with 0 items&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Biggest fix&lt;/td&gt;
&lt;td&gt;Country-bound residential proxies (60% → 95% success rate)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Biggest pricing mistake&lt;/td&gt;
&lt;td&gt;$0.30 per-run start fee killing 58% of customers&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Honest writeup. Nothing here is a flex.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;I shipped my first paid actor on &lt;a href="https://apify.com/store?fpr=8fp2od" rel="noopener noreferrer"&gt;Apify Store&lt;/a&gt; in February 2026. The actor is a &lt;a href="https://apify.com/kazkn/vinted-turbo-scraper?fpr=8fp2od" rel="noopener noreferrer"&gt;Vinted scraper&lt;/a&gt; — paste a search URL, get JSON listings, export to CSV/Excel. Niche, but real product-market fit.&lt;/p&gt;

&lt;p&gt;Three months in, the dashboard tells me 64 people have used it, 13 are active this month, and I'm netting roughly $200/month after Apify's 20% platform cut. I published 28 articles trying to drive growth and the combined view count across all of them is &lt;strong&gt;76 views&lt;/strong&gt;. Most articles got 0.&lt;/p&gt;

&lt;p&gt;If you're thinking about shipping a paid actor on Apify Store — or any indie SaaS where you don't control the distribution platform — this is what 90 days of that experience actually looks like.&lt;/p&gt;

&lt;h2&gt;
  
  
  Month 1: Shipping the MVP took 3 weeks. Most of it was anti-bot.
&lt;/h2&gt;

&lt;p&gt;Vinted has no public API. The mechanics of scraping it are well-understood by anyone who's tried — &lt;a href="https://datadome.co" rel="noopener noreferrer"&gt;Datadome&lt;/a&gt; guards every catalog page, and any sequence of plain HTTP requests gets you a 403 in under 60 seconds.&lt;/p&gt;

&lt;p&gt;The pattern that ended up working (and that I'd recommend if you're building anything against a Datadome-protected site in 2026):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 1. Open the catalog page ONCE in a real Playwright browser&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;playwright&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; 
  &lt;span class="na"&gt;proxy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;residentialProxyUrl&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; 
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newContext&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newPage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`https://www.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/catalog?&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForSelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.catalog-wrapper&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Datadome runs its JS challenge against a real Chromium environment&lt;/span&gt;
&lt;span class="c1"&gt;// with a coherent fingerprint. Cookies land in the page context.&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cookies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ua&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;navigator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userAgent&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// 2. Reuse those cookies in a fast HTTP loop with got-scraping&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cookieHeader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;; &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;gotScraping&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`https://www.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/api/v2/catalog/items?&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;amp;page=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;amp;per_page=96`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;Cookie&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;cookieHeader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;User-Agent&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ua&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;Referer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`https://www.&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/`&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;proxyUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;residentialProxyUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;responseType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;statusCode&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pushData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;transformItem&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The browser does the unlock. The HTTP client does the volume. ~10× faster than driving the browser for every page. Throughput goes from ~50 items/min to ~500 items/min.&lt;/p&gt;

&lt;p&gt;What I didn't realize at this stage: solving Datadome in dev mode (local laptop, residential IP I owned) is a completely different game from solving it in production for paying customers in different countries. That fact would cost me 76% of my users in months 2-3.&lt;/p&gt;

&lt;h2&gt;
  
  
  Month 2: Quietly losing 76% of my users
&lt;/h2&gt;

&lt;p&gt;By week 6, I had 51 lifetime users on the actor. Apify Store's dashboard showed a 91% success rate. The monthly active user count was sitting at 30.&lt;/p&gt;

&lt;p&gt;By week 10, monthly active was at 12.&lt;/p&gt;

&lt;p&gt;Same 91% success rate. Same product. Two-thirds of users gone, quietly.&lt;/p&gt;

&lt;p&gt;Apify Store displays success rate, total runs, and MAU on every actor's public page. It does NOT display a "users who came once and never returned" metric. The metric I cared most about didn't exist anywhere in the dashboard.&lt;/p&gt;

&lt;p&gt;I dug into the run-level data:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;143 SUCCEEDED runs&lt;/strong&gt; in 30 days&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;11 ABORTED runs&lt;/strong&gt; — users hitting Cancel themselves because they saw nothing happening&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3 FAILED runs&lt;/strong&gt; — visible failures with errors&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A success at the &lt;em&gt;process&lt;/em&gt; level (&lt;code&gt;exitCode 0&lt;/code&gt;, run finished cleanly) doesn't mean a success at the &lt;em&gt;product&lt;/em&gt; level (customer got data they paid for). The status display only knows about the former.&lt;/p&gt;

&lt;p&gt;The root cause: when Datadome served a challenge page (its JS proof-of-work + fingerprint check), my scraper waited 15 seconds for a catalog selector that would never appear, then logged a warning and &lt;em&gt;continued&lt;/em&gt; with whatever cookies it had. Those cookies were from the challenge state, not from a real authenticated session. The subsequent API call returned an empty array. The actor exited with &lt;code&gt;exitCode 0&lt;/code&gt;. Apify reported SUCCEEDED.&lt;/p&gt;

&lt;p&gt;From the user's perspective: open the dashboard, see ✅ Succeeded, click the dataset, see nothing. They don't file a bug. They just don't come back.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// What I had — silently continues even when the challenge is unsolved&lt;/span&gt;
&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForSelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.catalog-wrapper&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;15000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Selector timeout — continuing with current cookies&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// ← bad assumption&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cookies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;context&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// challenge cookies, not auth&lt;/span&gt;
&lt;span class="c1"&gt;// ...later: API returns 0 items, run "succeeds"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The 4-line fix that should have been there from day 1
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;isDatadomeChallenge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;documentElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;innerHTML&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;captcha-delivery.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
        &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dd_cookie_test&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;access denied&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;isDatadomeChallenge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;retire&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Datadome challenge detected — retrying with new session.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Plus a final-state assertion at the end of every run:&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totalItems&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Zero items extracted. The Vinted page returned no results, &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
    &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;or anti-bot blocked all attempts. Verify the URL or try again.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The throw at the end converts a silent SUCCEEDED-with-empty-dataset into a loud FAILED with an actionable message. Customers who used to open an empty dataset and churn now see a clear error and either retry, fix their URL, or contact support. None of those outcomes are silent.&lt;/p&gt;

&lt;p&gt;ABORTED runs in the next 14 days after deployment: 0.&lt;/p&gt;

&lt;h2&gt;
  
  
  The other half of the retention drop: country-bound proxies
&lt;/h2&gt;

&lt;p&gt;Apify's residential proxy pool, by default, rotates across all available countries. So a German user's URL request might be served by a US-based residential IP. Vinted geo-routes when the IP doesn't match the domain — sometimes returning empty results, sometimes 403, sometimes a different country's inventory. Datadome flags the mismatched-IP pattern as suspicious and challenges harder.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;TLD_TO_COUNTRY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;fr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FR&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;de&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;es&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ES&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;it&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;IT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;nl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;NL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;pl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;pt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;be&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;BE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;lt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;LT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;cz&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CZ&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;sk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;hu&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HU&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;ro&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;RO&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;hr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HR&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;fi&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FI&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;dk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;se&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;ee&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;EE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;gr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GR&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;ie&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;IE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;lu&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;LU&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;lv&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;LV&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;si&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SI&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;co.uk&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GB&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;TLD_TO_COUNTRY&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)];&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;proxyConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;Actor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createProxyConfiguration&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;useApifyProxy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;apifyProxyGroups&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;RESIDENTIAL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;countryCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// bind to URL TLD&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After this one config change, success rate on non-French markets went from ~60% to &amp;gt;95%. Same code, same actor, same Datadome — just one parameter aligning IP nationality with the URL's intended market.&lt;/p&gt;

&lt;h2&gt;
  
  
  Month 3: The pricing rebalance
&lt;/h2&gt;

&lt;p&gt;After the reliability fixes I looked at pricing. Original model: &lt;strong&gt;$0.30 per run start + $0.0015 per result&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For a 25-item run, the customer paid $0.34 — an effective rate of &lt;strong&gt;$13.60 per 1,000 items&lt;/strong&gt;, far above the market rate ($0.50–$3.50 per 1,000). Worth noting: &lt;strong&gt;58% of my runs were under 25 items&lt;/strong&gt;. Most of my customers were the small-batch monitoring users I was effectively pricing out.&lt;/p&gt;

&lt;p&gt;The new pricing: &lt;strong&gt;$0.04 per GB of memory at start (= $0.08 in 2 GB) + $0.0035 per result&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Volume&lt;/th&gt;
&lt;th&gt;Old price&lt;/th&gt;
&lt;th&gt;New price&lt;/th&gt;
&lt;th&gt;Δ for customer&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;25 items&lt;/td&gt;
&lt;td&gt;$0.34&lt;/td&gt;
&lt;td&gt;$0.17&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-50%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100 items&lt;/td&gt;
&lt;td&gt;$0.45&lt;/td&gt;
&lt;td&gt;$0.43&lt;/td&gt;
&lt;td&gt;-4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;200 items&lt;/td&gt;
&lt;td&gt;$0.60&lt;/td&gt;
&lt;td&gt;$0.78&lt;/td&gt;
&lt;td&gt;+30%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000 items&lt;/td&gt;
&lt;td&gt;$1.80&lt;/td&gt;
&lt;td&gt;$3.58&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+99%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Modeled on 90 days of historical run data, total revenue projection is roughly the same. The distribution just shifted toward customer fairness: small monitoring runs are cheap enough not to trigger sticker shock; bulk extractions pay fairly for the value delivered.&lt;/p&gt;

&lt;p&gt;Existing users keep the old pricing for 14 days — Apify's pricing-schedule policy automatically notifies them by email.&lt;/p&gt;

&lt;h2&gt;
  
  
  Month 3.5: 28 articles → 76 views
&lt;/h2&gt;

&lt;p&gt;This is the painful part.&lt;/p&gt;

&lt;p&gt;I wrote 28 articles trying to drive traffic to the actor. Topics: how to scrape Vinted, Datadome bypass patterns, integration tutorials for Discord/Telegram/Make.com, comparison reviews, etc. I cross-posted on dev.to, Hashnode, and Medium.&lt;/p&gt;

&lt;p&gt;Total combined view count across all 28 articles: &lt;strong&gt;76 views&lt;/strong&gt;. The most-viewed article got 11.&lt;/p&gt;

&lt;p&gt;What went wrong:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Topic saturation.&lt;/strong&gt; Publishing 28 articles about "Vinted scraping" on one account in 90 days deboosts the algo. Dev.to's recommendation engine flags accounts that flood one topic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No follower base.&lt;/strong&gt; A new dev.to account with 0 followers depends on the global feed, which moves faster than anyone can read.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wrong tags.&lt;/strong&gt; I used &lt;code&gt;webscraping&lt;/code&gt;, &lt;code&gt;saas&lt;/code&gt;, &lt;code&gt;productivity&lt;/code&gt; — the most saturated tags on the platform.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-posting deboost.&lt;/strong&gt; Cross-posting to Medium with &lt;code&gt;canonical_url&lt;/code&gt; set creates duplicate content signals.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No engagement loop.&lt;/strong&gt; I never commented on other people's articles. The algo rewards reciprocal engagement.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The honest lesson: content distribution is a &lt;em&gt;platform-specific&lt;/em&gt; discipline. Writing good articles is necessary but nowhere near sufficient. Knowing how the algo prioritizes you is the actual skill.&lt;/p&gt;

&lt;h2&gt;
  
  
  Things I'd do differently if I shipped a new actor tomorrow
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Build a "useful-but-zero-items" failure detector from day 1.&lt;/strong&gt; A &lt;code&gt;exitCode 0&lt;/code&gt; with empty output is the worst-case retention killer. Fail loud, always.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Country-bind residential proxies for any geo-routed product.&lt;/strong&gt; Datadome, Cloudflare, Akamai — they all flag mismatched IP/country pairs. Two lines of config. 35-point swing in success rate.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Match pricing to the smallest unit a customer wants to buy, not the average.&lt;/strong&gt; A $0.30 start fee on $0.50 jobs is a 60% tax. Reduce start fees aggressively, charge for value delivered.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Don't publish 28 articles on one platform with the same topic.&lt;/strong&gt; Pick 1-2 platforms, post weekly, build a follower base, comment on others' work. Distribution &amp;gt; volume.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Add a 2-minute video tutorial early.&lt;/strong&gt; Most users won't read your README. A YouTube link saying "watch this 2-min walkthrough" converts way better than 1,000 words of docs. My video is at &lt;a href="https://youtu.be/rWtZVDMflbo" rel="noopener noreferrer"&gt;youtu.be/rWtZVDMflbo&lt;/a&gt; — it took me 30 minutes to record and is now linked from every CTA.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ship with affiliate links from day 1.&lt;/strong&gt; Apify has a partner program (&lt;a href="https://apify.com/partners/affiliate" rel="noopener noreferrer"&gt;apify.com/partners/affiliate&lt;/a&gt;) that pays 20-30% recurring on referred customers. Stack it on top of your actor revenue. I added mine months in — pure money left on the table for 90 days.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What I'd recommend if you want to try Apify Store yourself
&lt;/h2&gt;

&lt;p&gt;The platform itself is solid for indie developers — you get Cloud infrastructure, pay-per-event billing, residential proxies, scheduler, dataset storage, and webhooks all included. Pay-Per-Result pricing means you only earn revenue when customers actually get value, which forces good UX.&lt;/p&gt;

&lt;p&gt;If you want to see what a finished actor looks like in practice, my &lt;a href="https://apify.com/kazkn/vinted-turbo-scraper?fpr=8fp2od" rel="noopener noreferrer"&gt;Vinted Turbo Scraper&lt;/a&gt; is publicly available — paste a Vinted search URL, get JSON listings. There's a 2-minute video walkthrough at &lt;a href="https://youtu.be/rWtZVDMflbo" rel="noopener noreferrer"&gt;youtu.be/rWtZVDMflbo&lt;/a&gt; and the open-source integration examples (curl, Node, Python, batch, scheduling) are at &lt;a href="https://github.com/Boo-n/vinted-turbo-scraper" rel="noopener noreferrer"&gt;github.com/Boo-n/vinted-turbo-scraper&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Apify's Free plan includes $5/month of platform credits — enough to test any actor on the Store without committing a dime.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How long does it really take to ship an Apify actor?
&lt;/h3&gt;

&lt;p&gt;For a moderate-complexity scraper with anti-bot handling: 2-4 weeks for MVP, another 2-4 weeks for production hardening (proxy logic, retry policies, monitoring). The Apify SDK + Crawlee framework handles a lot of the plumbing — you mostly write the scraping logic itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  How much can you actually earn?
&lt;/h3&gt;

&lt;p&gt;Wide range. From the public stats on Apify Store, most paid actors do $50-500/month. Top actors (with strong distribution + product-market fit) do $5k-30k/month. My actor is at the low end ($200/month) because my distribution is weak, not because the product is bad.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is it worth it as a side project?
&lt;/h3&gt;

&lt;p&gt;For most developers, the value isn't direct revenue — it's the SaaS shipping experience without having to build billing, auth, hosting, or proxy infrastructure. You get a real product with paying customers in 4-6 weeks. That's worth doing once even at $50/month.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can you scrape Vinted legally?
&lt;/h3&gt;

&lt;p&gt;Public catalog data (titles, prices, photos shown to anonymous browsers) sits in a grey area in most jurisdictions. Personal seller data is protected under GDPR and should not be redistributed without a lawful basis. The actor I built only extracts public catalog data anonymously.&lt;/p&gt;

&lt;h3&gt;
  
  
  What's the biggest mistake to avoid?
&lt;/h3&gt;

&lt;p&gt;The "silent success" failure mode. Whatever you ship, add an explicit assertion that the user got value at the end of every workflow. A clean &lt;code&gt;exitCode 0&lt;/code&gt; with no data is the worst possible customer experience because it's invisible.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Last verified: May 2026. Open to feedback — drop a comment if you've shipped on Apify Store and your numbers differ from mine.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>saas</category>
      <category>buildinpublic</category>
      <category>indiehackers</category>
      <category>webscraping</category>
    </item>
    <item>
      <title>How to scrape any Shopify store's apps + product catalog in one API call (full tutorial)</title>
      <dc:creator>KazKN</dc:creator>
      <pubDate>Thu, 21 May 2026 01:34:27 +0000</pubDate>
      <link>https://dev.to/boo_n/how-to-scrape-any-shopify-stores-apps-product-catalog-in-one-api-call-full-tutorial-1m24</link>
      <guid>https://dev.to/boo_n/how-to-scrape-any-shopify-stores-apps-product-catalog-in-one-api-call-full-tutorial-1m24</guid>
      <description>&lt;p&gt;&lt;a href="https://youtu.be/jxpSVYvZBFw" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fekj5k0a1cby3fu6h6f4l.jpg" alt="Watch the 3-minute walkthrough"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;▶️ &lt;em&gt;3-minute video walkthrough: input → run → dataset → API call.&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; — Step-by-step walkthrough: paste a list of Shopify URLs, get back products + installed apps + reviews in JSON. No headless browser, ~$0.005 per store, runs in batch. We'll cover the input schema, three real use-cases (ICP qualification, cold outbound personalization, app market-share research), and the cost math. Live on Apify Store: &lt;a href="https://apify.com/kazkn/shopify-scraper-apps-spy?fpr=8fp2od" rel="noopener noreferrer"&gt;Shopify Apps Spy + Product Scraper&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What you'll have at the end of this tutorial
&lt;/h2&gt;

&lt;p&gt;A working pipeline that turns this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://allbirds.com
https://gymshark.com
https://manukora.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;…into a CSV like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;store_domain,product_title,price,available,email_app,reviews_app,subs_app,product_url
allbirds.com,Wool Runner,110,true,Klaviyo,Yotpo,,https://allbirds.com/products/...
gymshark.com,Vital Seamless Bra,40,true,Klaviyo,Stamped,ReCharge,https://gymshark.com/...
manukora.com,UMF 20+ Honey,109,true,Postscript,Judge.me,Skio,https://manukora.com/...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Total time end-to-end: about 3 minutes for 50 stores.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1 — Sign up to Apify (free, no card needed)
&lt;/h2&gt;

&lt;p&gt;Go to &lt;a href="https://apify.com/sign-up" rel="noopener noreferrer"&gt;apify.com&lt;/a&gt; and create an account. You get &lt;strong&gt;$5 of free credit on signup&lt;/strong&gt;, which covers about 1,500 store scans at the standard tier. No credit card required.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2 — Open the actor
&lt;/h2&gt;

&lt;p&gt;Navigate to &lt;strong&gt;&lt;a href="https://apify.com/kazkn/shopify-scraper-apps-spy?fpr=8fp2od" rel="noopener noreferrer"&gt;Shopify Apps Spy + Product Scraper&lt;/a&gt;&lt;/strong&gt; on Apify Store.&lt;/p&gt;

&lt;p&gt;Click &lt;strong&gt;"Try for free"&lt;/strong&gt; → the actor opens in your console with a default input ready to run.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 3 — Configure the input
&lt;/h2&gt;

&lt;p&gt;The input has 9 fields. The 3 you care about for your first run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"store_urls"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"https://allbirds.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"https://gymshark.com"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"extract_level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"standard"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"max_products_per_store"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;250&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;code&gt;store_urls&lt;/code&gt;&lt;/strong&gt;: list of Shopify store URLs. Works with any custom domain, &lt;code&gt;*.myshopify.com&lt;/code&gt; URLs, or store homepages. Cap is 100 stores per single run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;extract_level&lt;/code&gt;&lt;/strong&gt;: choose what to pull.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Level&lt;/th&gt;
&lt;th&gt;Outputs&lt;/th&gt;
&lt;th&gt;Cost per store (avg)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;basic&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;products only&lt;/td&gt;
&lt;td&gt;$0.001&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;standard&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;products + apps installed&lt;/td&gt;
&lt;td&gt;$0.005&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;full&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;+ reviews from detected app&lt;/td&gt;
&lt;td&gt;$0.30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;pro&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;+ revenue estimation (placeholder, J4)&lt;/td&gt;
&lt;td&gt;TBD&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For 95% of use-cases, &lt;code&gt;standard&lt;/code&gt; is the sweet spot — you get the apps stack which is the actual signal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;max_products_per_store&lt;/code&gt;&lt;/strong&gt;: cap to avoid runaway costs on 50,000-product mega-stores. Default 250.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 4 — Run it
&lt;/h2&gt;

&lt;p&gt;Click &lt;strong&gt;Save &amp;amp; Start&lt;/strong&gt; → the actor boots, scrapes, and dumps the output to the &lt;strong&gt;default dataset&lt;/strong&gt; (top right of your console).&lt;/p&gt;

&lt;p&gt;For 2-store input, finishes in about 5 seconds. The dataset view auto-refreshes — you'll see products appear in real-time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 5 — Export the dataset
&lt;/h2&gt;

&lt;p&gt;Top-right of the dataset view, click &lt;strong&gt;Export&lt;/strong&gt; → choose CSV / JSON / Excel / RSS.&lt;/p&gt;

&lt;p&gt;Or if you prefer the API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"https://api.apify.com/v2/acts/kazkn~shopify-scraper-apps-spy/runs/last/dataset/items?format=csv&amp;amp;token=YOUR_TOKEN"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-o&lt;/span&gt; shopify-data.csv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What's in the output
&lt;/h2&gt;

&lt;p&gt;One record per product (or per variant if &lt;code&gt;include_variants: true&lt;/code&gt;). Each record carries the store-level apps stack:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"store_domain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"allbirds.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"store_meta"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allbirds"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"currency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"USD"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"product_title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Wool Runner"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"product_handle"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mens-wool-runners"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"vendor"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allbirds"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"product_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Sneakers"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"bestseller"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"wool"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;110&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"compare_at_price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"available"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"main_image"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://cdn.shopify.com/..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"apps_detected"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Klaviyo"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"reviews"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Yotpo"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"subscriptions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"popups"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Klaviyo Forms"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Searchanise"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"loyalty"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Smile.io"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"product_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://allbirds.com/products/mens-wool-runners"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scraped_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-05-02T14:30:00Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you set &lt;code&gt;extract_level: "full"&lt;/code&gt;, reviews come in a separate &lt;strong&gt;named dataset&lt;/strong&gt; called &lt;code&gt;reviews&lt;/code&gt;, with one row per review and a &lt;code&gt;product_handle&lt;/code&gt; foreign key to join back.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real use-case 1 — ICP qualification for B2B outbound
&lt;/h2&gt;

&lt;p&gt;Hypothesis: "Shopify stores running Klaviyo + a paid reviews app are good ICP for our retention SaaS — they spend money on retention tooling."&lt;/p&gt;

&lt;p&gt;Workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 1. Scrape your prospect list&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;store_urls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;prospectList&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// ~1,200 stores&lt;/span&gt;
  &lt;span class="na"&gt;extract_level&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;standard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;max_products_per_store&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// 2. After the run, filter the dataset&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tier1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; 
  &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;apps_detected&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Klaviyo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;apps_detected&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reviews&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Yotpo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; 
   &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;apps_detected&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reviews&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Judge.me Premium&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cost: 1,200 stores × $0.005 = &lt;strong&gt;$6 total&lt;/strong&gt;. Time: ~25 minutes.&lt;/p&gt;

&lt;p&gt;For comparison, the cheapest SaaS that does this filter is &lt;strong&gt;$199/month&lt;/strong&gt; with monthly export caps.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real use-case 2 — Cold outbound personalization
&lt;/h2&gt;

&lt;p&gt;Open the email with the actual stack the prospect runs. From a real test on 200 accounts, this moves reply rate from ~4% to ~11%.&lt;/p&gt;

&lt;p&gt;Pre-call mail merge field:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;opener&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reviewsApp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;apps_detected&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reviews&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;emailApp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;apps_detected&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;reviewsApp&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Judge.me&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;emailApp&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Klaviyo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`Saw you're on Judge.me Free + Klaviyo — same combo we
            saw at [REFERENCE_BRAND] before they...`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;reviewsApp&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Yotpo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;emailApp&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Klaviyo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`Noticed you're running Yotpo + Klaviyo — the data 
            integration there is usually the bottleneck...`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="c1"&gt;// ... 10-20 more conditions&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`Quick question about how you're handling [GENERIC_PROBLEM]...`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The conditional opener is what unlocks the reply rate. Generic openers stay around 3-5%.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real use-case 3 — App market-share research
&lt;/h2&gt;

&lt;p&gt;Scrape 5,000 stores in your vertical once a month. Aggregate the apps_detected fields. You'll have a real-time market-share dataset for any app category.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Aggregate email apps share over 5,000 stores&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;emailShare&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;apps_detected&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;emailShare&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;emailShare&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// emailShare = { Klaviyo: 2400, Mailchimp: 1100, Omnisend: 800, ... }&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the kind of data BuiltWith charges $295/month for. With the actor + a one-line aggregation, you have it for $25 per refresh.&lt;/p&gt;




&lt;h2&gt;
  
  
  Cost math
&lt;/h2&gt;

&lt;p&gt;The pricing is pay-per-event — you pay only for the rows you get, not for compute time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;store_analyzed&lt;/code&gt; — $0.003 per store&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;product_extracted&lt;/code&gt; — $0.0005 per product&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;apps_detected&lt;/code&gt; — $0.001 per store at standard+&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;review_extracted&lt;/code&gt; — $0.0003 per review&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Run&lt;/th&gt;
&lt;th&gt;Stores&lt;/th&gt;
&lt;th&gt;Products avg&lt;/th&gt;
&lt;th&gt;Reviews avg&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Small batch (Standard)&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;–&lt;/td&gt;
&lt;td&gt;$0.45&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Medium ICP scan (Standard)&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;–&lt;/td&gt;
&lt;td&gt;$9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Full reviews pull (Full)&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;500&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;$30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monthly market research (Standard)&lt;/td&gt;
&lt;td&gt;5,000&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;–&lt;/td&gt;
&lt;td&gt;$45&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The free $5 Apify credit covers ~1,500 stores at standard. You'd need to run several batches before paying anything.&lt;/p&gt;




&lt;h2&gt;
  
  
  Common gotchas
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. The store doesn't expose &lt;code&gt;/products.json&lt;/code&gt;.&lt;/strong&gt;&lt;br&gt;
Rare, but some custom themes disable it. The actor logs a warning and falls back to scraping the sitemap.xml. Always check the run log for &lt;code&gt;404&lt;/code&gt; warnings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Detection misses one app.&lt;/strong&gt;&lt;br&gt;
About 70-80% of installed apps are detectable from the storefront HTML. Backend-only apps (accounting, inventory, shipping) don't load scripts — they're invisible to any scanner. If you spot a missing detector for a frontend-loading app, ping me on the actor's GitHub and I'll add it (~15-min job).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Rate-limit warnings on big batches.&lt;/strong&gt;&lt;br&gt;
At default concurrency (5 simultaneous requests), you should hit no limits. If you crank &lt;code&gt;max_concurrent_requests&lt;/code&gt; to 20 and hit 429s, the actor backs off automatically with jitter.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Reviews on &lt;code&gt;extract_level: "full"&lt;/code&gt; blow the budget.&lt;/strong&gt;&lt;br&gt;
A 500-product store with 100 reviews each = 50,000 review rows = $15 alone. Use &lt;code&gt;max_reviews_per_product: 20&lt;/code&gt; to keep costs predictable.&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is scraping &lt;code&gt;/products.json&lt;/code&gt; allowed?
&lt;/h3&gt;

&lt;p&gt;Shopify exposes &lt;code&gt;/products.json&lt;/code&gt; publicly on every store by default. The actor never authenticates, never bypasses access controls, and respects rate limits. For commercial use of scraped data, consult a lawyer in your jurisdiction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I get one record per variant instead of per product?
&lt;/h3&gt;

&lt;p&gt;Yes — set &lt;code&gt;include_variants: true&lt;/code&gt; in the input and the dataset returns one row per SKU with size/color/price/availability normalized.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does this work on &lt;code&gt;*.myshopify.com&lt;/code&gt; URLs?
&lt;/h3&gt;

&lt;p&gt;Yes. The actor canonicalizes URLs internally — &lt;code&gt;https://yourstore.myshopify.com&lt;/code&gt;, &lt;code&gt;https://yourstore.com&lt;/code&gt;, and &lt;code&gt;https://www.yourstore.com&lt;/code&gt; all route to the same scrape.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do I integrate this into a Make.com or Zapier workflow?
&lt;/h3&gt;

&lt;p&gt;Apify has native Zapier integration — search "Apify" in Zapier triggers, choose "Run Actor", paste the input JSON. Make.com works the same way via the Apify HTTP module.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I run this on a schedule?
&lt;/h3&gt;

&lt;p&gt;Apify supports cron-style scheduling. Click &lt;strong&gt;Schedule&lt;/strong&gt; in the actor view, set the cadence (e.g., every Monday 8am), and the actor runs automatically with the same input.&lt;/p&gt;




&lt;h2&gt;
  
  
  Wrap up
&lt;/h2&gt;

&lt;p&gt;To recap the pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Sign up on Apify, get $5 free credit.&lt;/li&gt;
&lt;li&gt;Open the &lt;a href="https://apify.com/kazkn/shopify-scraper-apps-spy?fpr=8fp2od" rel="noopener noreferrer"&gt;actor&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Paste your store URLs, choose &lt;code&gt;standard&lt;/code&gt; extract level.&lt;/li&gt;
&lt;li&gt;Run it. Wait 25 minutes for 1,000 stores.&lt;/li&gt;
&lt;li&gt;Export the CSV / JSON / RSS / Excel.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Total cost for 1,000 stores: &lt;strong&gt;about $9&lt;/strong&gt;. The cheapest SaaS alternative I tested for the same volume was $199/month.&lt;/p&gt;

&lt;p&gt;If a detector is missing, ping me — each is a 15-minute add. If you find a use-case I haven't documented, I'll add it here. The actor is on Apify Store: &lt;strong&gt;&lt;a href="https://apify.com/kazkn/shopify-scraper-apps-spy?fpr=8fp2od" rel="noopener noreferrer"&gt;kazkn/shopify-scraper-apps-spy&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Was this useful?&lt;/strong&gt; ❤️ a reaction or drop a comment with the use-case you're trying to solve — I read every reply and add detector + endpoint coverage based on what people actually need.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Follow &lt;a href="https://dev.to/boo_n"&gt;@boo_n&lt;/a&gt;&lt;/strong&gt; for the next tutorials in this series: scraping reviews at scale, building a Shopify ICP dataset for cold outreach, and turning the actor into an MCP tool for Claude / Cursor.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Tags: shopify, ecommerce, api, tutorial, javascript, webdev&lt;/em&gt;&lt;/p&gt;

</description>
      <category>shopify</category>
      <category>ecommerce</category>
      <category>api</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>My Indie SaaS Was Quietly Bleeding Users — The 4-Line Fix and Pricing Rebalance That Turned It Around</title>
      <dc:creator>KazKN</dc:creator>
      <pubDate>Fri, 08 May 2026 22:11:17 +0000</pubDate>
      <link>https://dev.to/boo_n/my-indie-saas-was-quietly-bleeding-users-the-4-line-fix-and-pricing-rebalance-that-turned-it-4f0n</link>
      <guid>https://dev.to/boo_n/my-indie-saas-was-quietly-bleeding-users-the-4-line-fix-and-pricing-rebalance-that-turned-it-4f0n</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; &lt;em&gt;(May 2026)&lt;/em&gt; — My indie SaaS on Apify Store dropped from 51 lifetime users to 12 monthly active in 3 months. Diagnostic: a misleading "Succeeded" status was hiding zero-item runs from users. Fix: detect anti-bot challenges explicitly + fail loud + bind residential proxies to the URL country. Then I rebalanced pricing — small batches dropped 50%, large batches doubled. Same revenue projection, far less churn. Three engineering decisions, one product turnaround.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;I shipped my first paid Apify actor in early 2026. By month three, the dashboard was telling two different stories. The success-rate graph said 91%. The MAU graph said I had lost 76% of my users — from 51 lifetime to 12 monthly active. Both were technically true. The gap between them is what this article is about.&lt;/p&gt;

&lt;p&gt;If you ship indie SaaS, sell scrapers, or run any product where customers can churn silently, the patterns here apply. The mechanics are specific to web scraping (Datadome, residential proxies, Apify's pay-per-event pricing), but the lesson — &lt;em&gt;silent success kills retention faster than loud failure&lt;/em&gt; — is universal.&lt;/p&gt;

&lt;h2&gt;
  
  
  The metric I should have looked at sooner
&lt;/h2&gt;

&lt;p&gt;Apify Store displays success rate, total runs, and monthly users on every actor's public page. It does NOT display a "users who stopped coming back" metric. The actor I'd shipped showed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Total users&lt;/strong&gt;: 51&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monthly active&lt;/strong&gt;: 13&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Success rate (30 days)&lt;/strong&gt;: 91%&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're a creator and your gut says "91% is good", same. The 91% was the surface lie. Underneath, the actual ratio was much worse.&lt;/p&gt;

&lt;p&gt;I dug into the run-level data:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;11 ABORTED runs in 30 days — users hitting Cancel themselves because they saw nothing happening&lt;/li&gt;
&lt;li&gt;3 FAILED runs (visible failures with errors)&lt;/li&gt;
&lt;li&gt;143 SUCCEEDED runs — but a meaningful chunk of these returned &lt;strong&gt;zero items&lt;/strong&gt; in the dataset&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A success at the &lt;em&gt;process&lt;/em&gt; level (&lt;code&gt;exitCode 0&lt;/code&gt;, run finished cleanly) doesn't mean a success at the &lt;em&gt;product&lt;/em&gt; level (the user got data they paid for). Apify's status display only knows about the former.&lt;/p&gt;

&lt;p&gt;From the user's perspective: open the console, see ✅ Succeeded, click the dataset, see nothing. They don't file a bug. They just don't come back. There's no churn signal you can react to in time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bug behind the silent success
&lt;/h2&gt;

&lt;p&gt;The actor scrapes &lt;a href="https://www.vinted.com" rel="noopener noreferrer"&gt;Vinted&lt;/a&gt;, the European secondhand marketplace. Vinted has no public API and is protected by Datadome, one of the more aggressive anti-bot layers on the web. The scraping pattern (which I'll detail below) involves a Playwright browser bootstrapping a session and a fast HTTP loop using the captured cookies.&lt;/p&gt;

&lt;p&gt;The bug, in plain English: when Datadome served a challenge page (its JavaScript proof-of-work + fingerprint check), my scraper waited 15 seconds for a catalog selector that would never appear, then logged a warning and &lt;em&gt;continued&lt;/em&gt; with whatever cookies it had collected. Those cookies were from the challenge state, not from a real authenticated session. The subsequent API call returned an empty array. The actor exited with &lt;code&gt;exitCode 0&lt;/code&gt;. Apify reported SUCCEEDED.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// What I had — silently continues even when the challenge is unsolved&lt;/span&gt;
&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForSelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.catalog-wrapper&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;15000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Selector timeout — continuing with current cookies&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// ← bad assumption&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cookies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;context&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;  &lt;span class="c1"&gt;// challenge cookies, not auth&lt;/span&gt;
&lt;span class="c1"&gt;// ...later: API returns 0 items, run "succeeds"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The 4-line fix
&lt;/h2&gt;

&lt;p&gt;Three small additions made the bug observable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;isDatadomeChallenge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;documentElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;innerHTML&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;captcha-delivery.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
        &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dd_cookie_test&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;access denied&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;isDatadomeChallenge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;retire&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`[Datadome] Challenge detected — retrying with new session.`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Plus a final-state check at the end of the run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totalItems&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Zero items extracted. The Vinted page or filters may have returned no results, &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
    &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;or anti-bot blocked all attempts. Verify the URL or try again.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The throw at the end converts a silent SUCCEEDED-with-empty-dataset into a loud FAILED with an actionable message. Customers who used to open an empty dataset and churn now see a clear error and either retry, fix their URL, or contact support. None of those outcomes are silent.&lt;/p&gt;

&lt;p&gt;The numbers after deployment: 0 ABORTED runs in the next 14 days. The implicit "kill my own run because nothing's happening" pattern disappeared.&lt;/p&gt;

&lt;h2&gt;
  
  
  The country-binding fix that 3 weeks of debugging didn't surface
&lt;/h2&gt;

&lt;p&gt;The other half of the retention drop was about non-French customers. My actor scraped Vinted reliably on &lt;code&gt;vinted.fr&lt;/code&gt; but had a much lower success rate on &lt;code&gt;vinted.de&lt;/code&gt;, &lt;code&gt;vinted.es&lt;/code&gt;, &lt;code&gt;vinted.it&lt;/code&gt;, etc. Customers in those markets churned hardest.&lt;/p&gt;

&lt;p&gt;Apify's residential proxy pool, by default, rotates across all available countries. So a German user's URL request might be served by a US-based residential IP. Vinted geo-routes when the IP doesn't match the domain — sometimes returning empty results, sometimes 403, sometimes a different country's inventory. Datadome flags the mismatched-IP pattern as suspicious and challenges harder.&lt;/p&gt;

&lt;p&gt;The fix is one config tweak: bind the proxy &lt;code&gt;countryCode&lt;/code&gt; to the URL's TLD before instantiating the crawler.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;TLD_TO_COUNTRY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;fr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FR&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;de&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;es&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ES&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;it&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;IT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;nl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;NL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;pl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;pt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;PT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;be&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;BE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;lt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;LT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;cz&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CZ&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;sk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;hu&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HU&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;ro&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;RO&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;hr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HR&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;fi&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FI&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;dk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;se&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;ee&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;EE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;gr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GR&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;ie&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;IE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;lu&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;LU&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;lv&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;LV&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;si&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SI&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;co.uk&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GB&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;TLD_TO_COUNTRY&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)];&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;proxyConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;Actor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createProxyConfiguration&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;useApifyProxy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;apifyProxyGroups&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;RESIDENTIAL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;countryCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After deploying, success rate on non-French markets went from ~60% to &amp;gt;95%. Same code, same actor, same Datadome — just one parameter that aligns the IP nationality with the URL's intended market.&lt;/p&gt;

&lt;p&gt;For customers who paste multiple URLs from different countries in one batch, the actor groups by country and runs a separate crawler per group, each with its own bound proxy. The customer pastes a flat list, the actor dispatches them transparently.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pricing rebalance: 50% cheaper for small batches, 99% more for large
&lt;/h2&gt;

&lt;p&gt;After the reliability fixes, I pulled 90 days of run analytics and looked at the size distribution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;58%&lt;/strong&gt; of runs were under 25 items&lt;/li&gt;
&lt;li&gt;17% between 25 and 100 items&lt;/li&gt;
&lt;li&gt;25% over 100 items&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pricing model was: &lt;strong&gt;$0.30 per run start + $0.0015 per result&lt;/strong&gt;. For a 25-item run, the customer paid $0.34 — an effective rate of &lt;strong&gt;$13.60 per 1,000 items&lt;/strong&gt;, far above the market average ($0.50–$3.50 per 1,000). Customers tried once, saw the receipt, never came back. That start fee was the silent killer of small-batch use cases (monitoring, alerts, exploratory scraping).&lt;/p&gt;

&lt;p&gt;The new pricing: &lt;strong&gt;$0.04 per GB of memory at start (= $0.08 in 2 GB) + $0.0035 per result&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Volume&lt;/th&gt;
&lt;th&gt;Old price&lt;/th&gt;
&lt;th&gt;New price&lt;/th&gt;
&lt;th&gt;Δ for customer&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;25 items&lt;/td&gt;
&lt;td&gt;$0.34&lt;/td&gt;
&lt;td&gt;$0.17&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-50%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100 items&lt;/td&gt;
&lt;td&gt;$0.45&lt;/td&gt;
&lt;td&gt;$0.43&lt;/td&gt;
&lt;td&gt;-4%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;200 items&lt;/td&gt;
&lt;td&gt;$0.60&lt;/td&gt;
&lt;td&gt;$0.78&lt;/td&gt;
&lt;td&gt;+30%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000 items&lt;/td&gt;
&lt;td&gt;$1.80&lt;/td&gt;
&lt;td&gt;$3.58&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+99%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Modeled on 90 days of historical run data, total revenue projection is roughly the same. But the distribution shifted toward customer fairness: small monitoring runs are cheap enough not to trigger sticker shock; bulk extractions pay fairly for the value delivered.&lt;/p&gt;

&lt;p&gt;Existing users keep the old pricing for 14 days (Apify's pricing-schedule policy automatically notifies them by email). New customers see the new pricing immediately.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architectural pattern that made all this possible
&lt;/h2&gt;

&lt;p&gt;The actor uses what I'd call &lt;strong&gt;asymmetric scraping&lt;/strong&gt;, since the term doesn't seem to have a canonical name yet:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;One Playwright browser&lt;/strong&gt; opens the catalog page on a residential IP. Datadome runs its JS challenge against a real Chromium environment with a coherent fingerprint. Cookies (&lt;code&gt;datadome&lt;/code&gt;, &lt;code&gt;dd_cookie_test&lt;/code&gt;, plus Vinted's &lt;code&gt;_vinted_*_session&lt;/code&gt;) are deposited in the page context.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reuse those cookies in a fast HTTP loop&lt;/strong&gt; — &lt;code&gt;got-scraping&lt;/code&gt; for Node, &lt;code&gt;requests&lt;/code&gt; with custom headers for Python. Hit Vinted's internal &lt;code&gt;/api/v2/catalog/items&lt;/code&gt; endpoint directly, paginated. ~10× faster than driving the browser for every page request.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;On 401/403/429&lt;/strong&gt;: drop the session, regenerate via Playwright with a fresh residential IP, resume the loop where it left off.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The browser does the unlock. The HTTP client does the volume. Throughput goes from ~50 items/min for pure-browser scraping to ~500 items/min in this hybrid mode.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three lessons I keep relearning as an indie shipper
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Silent success &amp;gt; loud failure for retention.&lt;/strong&gt; A run that returns zero items should &lt;em&gt;fail loud&lt;/em&gt;, not succeed quietly. Status displays based on &lt;code&gt;exitCode&lt;/code&gt; lie about product-level outcomes. Always assert at the end of every workflow that the user got what they paid for, and crash if not.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Country-bind your residential proxies for any geo-routed product.&lt;/strong&gt; Datadome, Cloudflare, Akamai — all of them flag the mismatched-IP pattern. Two lines of config (&lt;code&gt;countryCode&lt;/code&gt;) are worth a 35-percentage-point swing in success rate on non-default markets.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;When 60% of customers churn quietly, look at your fixed fees, not your per-unit price.&lt;/strong&gt; The headline rate ($1.50/1k) was reasonable. The hidden $0.30 minimum was lethal for the 58% of runs that were small. Always price for the smallest unit your customer actually wants to buy, not your average ARPU.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Where the fix lives
&lt;/h2&gt;

&lt;p&gt;The actor is the &lt;a href="https://apify.com/kazkn/vinted-turbo-scraper?fpr=8fp2od" rel="noopener noreferrer"&gt;Vinted Turbo Scraper&lt;/a&gt; on Apify Store. Open-source integration examples (curl, Node, Python, batch multi-country, scheduling) are at &lt;a href="https://github.com/Boo-n/vinted-turbo-scraper" rel="noopener noreferrer"&gt;github.com/Boo-n/vinted-turbo-scraper&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Apify's free plan includes $5/month of credits, enough to run a few hundred scrapes before you commit to anything.&lt;/p&gt;

&lt;p&gt;If you ship anything similar, drop a comment with your retention/pricing tradeoffs — would love to compare notes.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Last verified: May 2026.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>saas</category>
      <category>productivity</category>
      <category>webscraping</category>
      <category>buildinpublic</category>
    </item>
    <item>
      <title>The Best Vinted Scraper in 2026 — Honest Comparison of 8 Tools (Tested)</title>
      <dc:creator>KazKN</dc:creator>
      <pubDate>Wed, 06 May 2026 15:26:04 +0000</pubDate>
      <link>https://dev.to/boo_n/the-best-vinted-scraper-in-2026-honest-comparison-of-8-tools-tested-5ej8</link>
      <guid>https://dev.to/boo_n/the-best-vinted-scraper-in-2026-honest-comparison-of-8-tools-tested-5ej8</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR — Best Vinted scraper depends on your use case&lt;/strong&gt; &lt;em&gt;(May 2026, all data verified on Apify Store)&lt;/em&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;th&gt;Pick&lt;/th&gt;
&lt;th&gt;Pricing&lt;/th&gt;
&lt;th&gt;Success rate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Paste URL → JSON pipeline&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Vinted Turbo Scraper&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.08 + $3.50/1k&lt;/td&gt;
&lt;td&gt;95 %+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-country price arbitrage&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Vinted Smart Scraper&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.50/1k flat&lt;/td&gt;
&lt;td&gt;98.4 %&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Continuous monitoring + alerts&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;epicscrapers' Monitor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.001/result + $0.00083/check&lt;/td&gt;
&lt;td&gt;99.5 %&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&amp;lt;100 listings/week (free)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;shahidirfan's Scraper&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;td&gt;~80 %&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI agent (Claude/Cursor)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Vinted MCP Server&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~$0&lt;/td&gt;
&lt;td&gt;98.7 %&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Skip if success rate &amp;lt; 90 % — Datadome will eat your runs.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;I'm one of the developers behind two of the scrapers on this list (Vinted Turbo Scraper and Vinted Smart Scraper, both on the &lt;a href="https://apify.com/kazkn" rel="noopener noreferrer"&gt;&lt;code&gt;kazkn&lt;/code&gt; Apify profile&lt;/a&gt;). So this isn't neutral — but I run my actors' analytics weekly &lt;em&gt;and&lt;/em&gt; my competitors' public Apify Store stats, which gives me unusually good visibility into how each tool actually performs in production.&lt;/p&gt;

&lt;p&gt;If you're choosing a Vinted scraper in 2026 and don't want to waste $20 testing them all, this comparison gives you the answer in under 5 minutes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a Vinted scraper exists in the first place
&lt;/h2&gt;

&lt;p&gt;Vinted is the largest secondhand fashion marketplace in Europe — 100M+ users, ~25 country-specific domains (vinted.fr, vinted.de, vinted.co.uk, vinted.es, vinted.it, vinted.pl, etc.). It does not publish a public API. The website's internal &lt;code&gt;/api/v2/catalog/items&lt;/code&gt; endpoint is protected by &lt;a href="https://datadome.co" rel="noopener noreferrer"&gt;Datadome&lt;/a&gt;, one of the more aggressive anti-bot layers on the web.&lt;/p&gt;

&lt;p&gt;This makes scraping non-trivial. Resellers, market researchers, price-tracking startups, and AI agents all need Vinted catalog data — but the technical barrier (Datadome bypass + residential proxies + country binding) is enough that most people pay for a managed scraper rather than build their own.&lt;/p&gt;

&lt;p&gt;That's the market this comparison covers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Methodology
&lt;/h2&gt;

&lt;p&gt;For each tool below, I checked:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Public success rate&lt;/strong&gt; (Apify Store displays &lt;code&gt;succeededRuns / totalRuns&lt;/code&gt; for the last 30 days on every actor page)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pricing model and effective cost per 1k listings&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Country coverage&lt;/strong&gt; (which Vinted TLDs are supported reliably)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Speed&lt;/strong&gt; (items/min based on my own test runs and public stats)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What it's actually good at&lt;/strong&gt; (architecture, output schema, reliability tradeoffs)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All numbers below were captured between &lt;strong&gt;May 2 and May 6, 2026&lt;/strong&gt;. Apify Store updates these stats every 24h, so they're as current as it gets.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. ⚡ Vinted Turbo Scraper &lt;em&gt;(my own — full disclosure)&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://apify.com/kazkn/vinted-turbo-scraper?fpr=8fp2od" rel="noopener noreferrer"&gt;apify.com/kazkn/vinted-turbo-scraper&lt;/a&gt; · &lt;a href="https://github.com/Boo-n/vinted-turbo-scraper" rel="noopener noreferrer"&gt;GitHub examples&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Spec&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pricing&lt;/td&gt;
&lt;td&gt;$0.04/GB Actor Start + $0.0035/result&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Effective cost (100 items, 2 GB)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.43&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Effective cost (1,000 items, 2 GB)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$3.58&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Success rate (30d)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;95 %+&lt;/strong&gt; (post v0.0.89 country-binding rewrite)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;~500 items/min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Country coverage&lt;/td&gt;
&lt;td&gt;All 26 Vinted TLDs (auto-detected and country-bound proxy)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output&lt;/td&gt;
&lt;td&gt;Structured JSON (id, title, price, brand, size, condition, photos, seller, location, scrapedAt)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;What it does well&lt;/strong&gt;: turns a Vinted search URL into structured JSON as fast as possible. The whole input is just &lt;code&gt;startUrls&lt;/code&gt; (newline-separated, batch-supported), &lt;code&gt;maxItems&lt;/code&gt;, and proxy config. No filter form to fill in. URL-native.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's special&lt;/strong&gt; (and what fixed my retention 2 months ago): country-aware proxy binding. &lt;code&gt;vinted.fr&lt;/code&gt; URLs get FR residential IPs, &lt;code&gt;vinted.de&lt;/code&gt; URLs get DE IPs, etc. Without this, a French IP scraping vinted.de gets geo-redirected and returns garbage. This single config tweak took my non-FR success rate from ~60 % to &amp;gt;95 %.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What it doesn't do&lt;/strong&gt;: no seller analysis, no sold-item lookup, no cross-country comparison. For that, see Vinted Smart Scraper below.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pick this if&lt;/strong&gt;: you want the fastest setup from filtered Vinted search → exported listings, you batch URLs across multiple countries, you need it reliable on non-FR markets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try it free&lt;/strong&gt;: Apify Free plan includes $5/month of platform credits — enough for ~1,400 listings.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. 🧠 Vinted Smart Scraper — Cross-Country Price Comparison &lt;em&gt;(my own)&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://apify.com/kazkn/vinted-smart-scraper?fpr=8fp2od" rel="noopener noreferrer"&gt;apify.com/kazkn/vinted-smart-scraper&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Spec&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pricing&lt;/td&gt;
&lt;td&gt;$0.0005/result flat ($0.50 / 1k)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Effective cost (1,000 items)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.50&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Success rate (30d)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;98.4 %&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Country coverage&lt;/td&gt;
&lt;td&gt;19 Vinted TLDs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Cross-country arbitrage research, sold-item analysis, seller deep-dive&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Why it's cheaper than Turbo per result&lt;/strong&gt;: it's a research-tier actor optimized for volume. You'd typically run it once per day for arbitrage research, not poll it every 15 minutes for monitoring.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Killer feature&lt;/strong&gt;: pulls listings from multiple country domains in parallel and computes the spread per item. The same Nike Dunk Low can sit at €180 in France, €145 in Germany, and €220 in Italy — that spread is where reseller arbitrage happens. It also does seller analysis (number of listings, response rate, ratings) and sold-item lookup (extremely rare in the Vinted scraper space).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pick this if&lt;/strong&gt;: you're a reseller doing cross-country arbitrage, or building a Vinted price intelligence dashboard.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Vinted Scraper + Monitor &lt;em&gt;(epicscrapers)&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://apify.com/epicscrapers/vinted-search-scraper" rel="noopener noreferrer"&gt;apify.com/epicscrapers/vinted-search-scraper&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Spec&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pricing&lt;/td&gt;
&lt;td&gt;$0.001/result + $0.00083/monitor check&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Success rate (30d)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;99.5 %&lt;/strong&gt; &lt;em&gt;(highest in this comparison)&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Continuous monitoring with native alert primitives&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;What it's good at&lt;/strong&gt;: built-in alert plumbing. You configure it to ping every X seconds and trigger webhook actions when new listings match your filter. Good fit if you want monitoring without wiring up your own scheduler + webhook.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tradeoff&lt;/strong&gt;: more expensive at scale than running Apify Scheduler + Vinted Turbo Scraper yourself. Math: monitoring every 15 min on 25 items = (96 runs × $0.08) + (96 × 25 × $0.0035) = &lt;strong&gt;$15.84/day&lt;/strong&gt; with Turbo. Same setup with epicscrapers' monitor = (96 × $0.00083 × 96 polls/day) + (25 × $0.001 × 96) = &lt;strong&gt;$10.05/day&lt;/strong&gt;. So at high frequency, epicscrapers wins. At low frequency (hourly polls), Turbo wins.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pick this if&lt;/strong&gt;: you want plug-and-play monitoring without writing scheduler config.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. shahidirfan's Vinted Scraper &lt;em&gt;(free)&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://apify.com/shahidirfan/Vinted-Scraper" rel="noopener noreferrer"&gt;apify.com/shahidirfan/Vinted-Scraper&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Spec&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pricing&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;FREE&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Effective cost&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Hobbyist, occasional small-volume runs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;What it's good at&lt;/strong&gt;: literally free. If you're scraping &amp;lt;100 listings a week for a personal project, this is hard to beat on cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tradeoff&lt;/strong&gt;: limited support, no country-bound proxy logic (fails often on non-FR markets), may break when Vinted updates anti-bot tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pick this if&lt;/strong&gt;: you're a hobbyist, you only scrape vinted.fr, and your runs are infrequent enough that occasional failures don't break your workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. 🤖 Vinted MCP Server &lt;em&gt;(my own — for AI agents)&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://apify.com/kazkn/vinted-mcp-server?fpr=8fp2od" rel="noopener noreferrer"&gt;apify.com/kazkn/vinted-mcp-server&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Spec&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pricing&lt;/td&gt;
&lt;td&gt;$0.00001/result (effectively free)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;For&lt;/td&gt;
&lt;td&gt;Claude / Cursor / Windsurf AI agents&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;What it does&lt;/strong&gt;: exposes Vinted data as Model Context Protocol tools so an AI agent can call "search Vinted for X" and get structured results. Built on Apify's MCP server primitive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pick this if&lt;/strong&gt;: you're building an AI agent (Claude Project, Cursor extension, Windsurf workflow) that needs Vinted data live, not at scrape-time.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. automation-lab's Vinted Scraper &lt;em&gt;(skip if you can)&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://apify.com/automation-lab/vinted-scraper" rel="noopener noreferrer"&gt;apify.com/automation-lab/vinted-scraper&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Spec&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pricing&lt;/td&gt;
&lt;td&gt;$0.0027/result + $0.00475 start&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Success rate (30d)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;56 %&lt;/strong&gt; &lt;em&gt;(low — known Datadome issues)&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;What it's good at&lt;/strong&gt;: 16 markets supported on paper.&lt;br&gt;
&lt;strong&gt;Tradeoff&lt;/strong&gt;: 56 % success rate. Almost half your runs fail. At 100 items/run, you'd average 56 successful items per attempt with full charge for failures — that's effectively $4.85 / 1k usable items vs the advertised $2.70. Look at the public stats page before committing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict&lt;/strong&gt;: skip until they fix the proxy story. The pricing looks competitive but the success rate makes it a worse deal than Turbo or Smart.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. piotrv1001's Vinted Listings Scraper
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://apify.com/piotrv1001/vinted-listings-scraper" rel="noopener noreferrer"&gt;apify.com/piotrv1001/vinted-listings-scraper&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Spec&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pricing&lt;/td&gt;
&lt;td&gt;$0.001/result + $0.004 per seller detail (optional)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Success rate (30d)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;100 %&lt;/strong&gt; &lt;em&gt;(but small sample — verify on their page)&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;What it's good at&lt;/strong&gt;: granular pricing — you only pay for seller detail fetches if you need them. Good for resellers who only need listings 90 % of the time and seller data 10 % of the time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pick this if&lt;/strong&gt;: per-seller granularity matters and you're cost-sensitive on listings-only volume.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. saswave's Vinted Product &amp;amp; Profile Scraper &lt;em&gt;(skip)&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://apify.com/saswave/vinted-product-item-profile-scraper" rel="noopener noreferrer"&gt;apify.com/saswave/vinted-product-item-profile-scraper&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Spec&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pricing&lt;/td&gt;
&lt;td&gt;$0.001/result&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Success rate (30d)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;48.8 %&lt;/strong&gt; &lt;em&gt;(unstable)&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The actor description says "$0.40/1000 results, no proxies" — but the 48 % success rate is exactly &lt;em&gt;because&lt;/em&gt; there are no residential proxies. Without them, Datadome blocks half of the runs. Skip until proxies are added.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to choose in 60 seconds
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Are you scraping &amp;lt;100 listings/week for personal use?
└── YES → shahidirfan (free)
└── NO ↓

Do you need cross-country price comparison or arbitrage?
└── YES → Vinted Smart Scraper ($0.50/1k)
└── NO ↓

Are you building an AI agent (Claude / Cursor / Windsurf)?
└── YES → Vinted MCP Server (free tier)
└── NO ↓

Do you need monitoring with built-in alerts?
└── YES → epicscrapers' Monitor
└── NO → Vinted Turbo Scraper (paste URL, get JSON, done)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What to ask before paying for any Vinted scraper
&lt;/h2&gt;

&lt;p&gt;Four signals to check on the actor's Apify Store page &lt;strong&gt;before&lt;/strong&gt; running anything:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Public success rate?&lt;/strong&gt; Every Apify Store actor page shows it. &lt;strong&gt;Anything below 90 % means failed runs you'll still get charged for&lt;/strong&gt; (in some cases — depends on the pricing model). For Vinted specifically, &amp;lt;90 % is almost always a Datadome/proxy issue, and it won't get better without dev investment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Country-bound proxies?&lt;/strong&gt; This is the #1 reliability fix in the Vinted scraper space in 2026. If the actor doesn't bind proxy &lt;code&gt;countryCode&lt;/code&gt; to the URL TLD, you'll see empty results on non-FR markets. Test with a vinted.de URL before committing to volume.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pricing transparency?&lt;/strong&gt; Look for a clear pricing tab and a &lt;code&gt;maxTotalChargeUsd&lt;/code&gt; cap to prevent runaway runs from blowing your budget.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pagination metadata robustness?&lt;/strong&gt; Vinted has changed pagination response shape twice in 2026. If the actor breaks every time Vinted shifts a field, you'll get partial datasets silently.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is Vinted scraping legal in 2026?
&lt;/h3&gt;

&lt;p&gt;Public catalog data (titles, prices, photos shown to anonymous browsers) sits in a grey area in most jurisdictions. The widely cited &lt;em&gt;hiQ Labs v. LinkedIn&lt;/em&gt; (US, 2022) and the EU's Database Directive both lean permissive for non-personal data scraped from publicly accessible pages. &lt;strong&gt;Personal seller data is different&lt;/strong&gt; — it's protected under GDPR and you should not redistribute it without a lawful basis. Don't scrape what you can't see logged out, don't redistribute personal data, respect &lt;a href="https://www.vinted.com/terms_and_conditions" rel="noopener noreferrer"&gt;Vinted's terms&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does Vinted have a public API?
&lt;/h3&gt;

&lt;p&gt;No. Vinted does not publish a public API for third-party developers as of May 2026. The internal &lt;code&gt;/api/v2/catalog/items&lt;/code&gt; endpoint exists and is what scraper actors hit, but it's not documented and Vinted can change its shape at any time.&lt;/p&gt;

&lt;h3&gt;
  
  
  How much does it cost to scrape 1,000 Vinted listings?
&lt;/h3&gt;

&lt;p&gt;Depends on the tool. &lt;strong&gt;Cheapest&lt;/strong&gt;: shahidirfan's free actor ($0). &lt;strong&gt;Cheapest paid&lt;/strong&gt;: Vinted Smart Scraper at $0.50. &lt;strong&gt;Mid-tier&lt;/strong&gt;: Vinted Turbo Scraper at $3.58 (with $0.08 fixed start fee included). &lt;strong&gt;Most expensive in this comparison&lt;/strong&gt;: alkausari_mujahid at $10/1k.&lt;/p&gt;

&lt;h3&gt;
  
  
  Will scraping Vinted ban my account?
&lt;/h3&gt;

&lt;p&gt;Authenticated scraping (using your Vinted login cookies) carries account-ban risk. Anonymous scraping (no Vinted login) carries IP-ban risk on the proxy, not your account. All actors in this comparison use anonymous scraping with rotating residential proxies — no Vinted login required from you, and your account is not exposed.&lt;/p&gt;

&lt;h3&gt;
  
  
  What's the fastest Vinted scraper in 2026?
&lt;/h3&gt;

&lt;p&gt;Vinted Turbo Scraper at ~500 items/min on a single residential session. Vinted Smart Scraper hits ~400 items/min in cross-country mode (parallelized but with country-binding overhead). epicscrapers' Monitor is rate-limited by design (you set the polling interval, not the throughput).&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I scrape multiple Vinted countries in one run?
&lt;/h3&gt;

&lt;p&gt;Yes, but only with actors that have country-binding logic. Vinted Turbo Scraper, Vinted Smart Scraper, and (partially) epicscrapers' Monitor support multi-country batch input. Other actors in this comparison default to a single country per run, which means you'd run separate jobs per country and merge the data yourself.&lt;/p&gt;

&lt;h3&gt;
  
  
  Difference between Vinted Turbo Scraper and Vinted Smart Scraper?
&lt;/h3&gt;

&lt;p&gt;Turbo is &lt;strong&gt;fast and URL-native&lt;/strong&gt; ($0.08 + $3.50/1k). Paste a search URL, get JSON in 30-60 seconds. For monitoring, alerts, simple pipelines.&lt;/p&gt;

&lt;p&gt;Smart is &lt;strong&gt;research-grade&lt;/strong&gt; ($0.50/1k flat). Cross-country comparison, seller analysis, sold-item lookup. For arbitrage research, price intelligence dashboards, academic studies.&lt;/p&gt;

&lt;p&gt;Different tools for different intents. Both ship from the same dev team with the same anti-bot stack.&lt;/p&gt;




&lt;h2&gt;
  
  
  Bottom line
&lt;/h2&gt;

&lt;p&gt;Most Vinted scrapers fail for the same reason: they don't country-bind their residential proxies, so non-FR markets return empty results and Datadome flags the IP pool. The 3 scrapers I'd actually use in 2026 are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vinted Turbo Scraper&lt;/strong&gt; — for paste-URL → JSON workflows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vinted Smart Scraper&lt;/strong&gt; — for cross-country research&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;shahidirfan's free actor&lt;/strong&gt; — for occasional hobby runs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Apify's Free plan gives you $5/month of credits, enough to test all three before committing.&lt;/p&gt;

&lt;p&gt;If you'd like the full architectural writeup of how Vinted Turbo Scraper bypasses Datadome (asymmetric scraping, country-bound proxies, fail-loud retention pattern), I documented it on &lt;a href="https://github.com/Boo-n/vinted-turbo-scraper" rel="noopener noreferrer"&gt;my GitHub repo&lt;/a&gt; with curl, Node, and Python integration examples.&lt;/p&gt;

&lt;p&gt;Questions, corrections, or use cases I missed? Drop them in the comments — I update this guide monthly.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Last verified: May 6, 2026&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>vinted</category>
      <category>ecommerce</category>
      <category>automation</category>
    </item>
    <item>
      <title>I scraped 1,200 Shopify stores to qualify B2B leads — here's what I learned about ICP</title>
      <dc:creator>KazKN</dc:creator>
      <pubDate>Tue, 05 May 2026 12:50:27 +0000</pubDate>
      <link>https://dev.to/boo_n/i-scraped-1200-shopify-stores-to-qualify-b2b-leads-heres-what-i-learned-about-icp-12ag</link>
      <guid>https://dev.to/boo_n/i-scraped-1200-shopify-stores-to-qualify-b2b-leads-heres-what-i-learned-about-icp-12ag</guid>
      <description>&lt;p&gt;&lt;em&gt;A 6-week experiment in turning competitor research into a $0.005-per-store API.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://youtu.be/jxpSVYvZBFw" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fekj5k0a1cby3fu6h6f4l.jpg" alt="Watch the 2-minute walkthrough" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;▶️ &lt;em&gt;2-minute video walkthrough of the actor in action — input, run, dataset, API call.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;There is a quiet rule in B2B sales nobody puts on a slide: &lt;strong&gt;the cost of qualifying a bad lead is roughly equal to the cost of closing a good one.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Six weeks ago I was trying to validate ICP for a B2B side-project targeting Shopify operators. The hypothesis was tight: "Stores running Klaviyo plus a paid reviews app spend money on retention tooling, so they will pay for ours."&lt;/p&gt;

&lt;p&gt;To test it I needed two columns next to each store name: &lt;strong&gt;email provider&lt;/strong&gt; and &lt;strong&gt;reviews provider&lt;/strong&gt;. Maybe a third for &lt;strong&gt;subscriptions&lt;/strong&gt;. From those three I could segment 1,200 stores into a tier-one list of about 200, and avoid wasting outreach on the rest.&lt;/p&gt;

&lt;p&gt;The data exists. It is in the page source of every Shopify store. Apps inject &lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt; tags from their own CDN — &lt;code&gt;cdn.judge.me&lt;/code&gt;, &lt;code&gt;cdn.yotpo.com&lt;/code&gt;, &lt;code&gt;loox.io/widget&lt;/code&gt;, &lt;code&gt;klaviyo.com/onsite&lt;/code&gt;. Any store using Klaviyo loads a Klaviyo script. The information was right there.&lt;/p&gt;

&lt;p&gt;But three minutes of View Source per store, times 1,200 stores, is 60 hours. I do not have 60 hours.&lt;/p&gt;

&lt;p&gt;So I did the thing I had been avoiding for months: I wrote the scraper.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I expected to find
&lt;/h2&gt;

&lt;p&gt;I expected the scraping itself to be the hard part. It was not.&lt;/p&gt;

&lt;p&gt;I expected proxies, retries, and rate-limit roulette. None of it materialized — &lt;code&gt;/products.json&lt;/code&gt; is publicly served on every Shopify store and the homepage HTML is, well, a homepage. No bot challenges, no CAPTCHA, no reCAPTCHA. A polite concurrency limit of 5 simultaneous requests is enough to scan 1,000 stores in 25 minutes without anyone noticing.&lt;/p&gt;

&lt;p&gt;What I did not expect was how much &lt;strong&gt;the app stack tells you about the operator&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A Shopify store on Judge.me Free is a different company than a Shopify store on Yotpo Premium. Same revenue band, same vertical, same product type — totally different stage, budget, and pain points.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Judge.me Free → indie operator, doing under $30k/month, allergic to monthly subscriptions, will not buy your $99/month tool unless you frame it as ROI within 30 days.&lt;/li&gt;
&lt;li&gt;Yotpo Premium → seven-figure DTC brand, has a marketing team, will compare you against 4 competitors in a vendor matrix before signing, and will negotiate.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can replicate this exercise across every app category:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Klaviyo Free → still building list, every dollar matters.&lt;/li&gt;
&lt;li&gt;Klaviyo paid (&amp;gt;$100/mo) → mature email program, ready for sophisticated tooling.&lt;/li&gt;
&lt;li&gt;Postscript → SMS-first DTC, modern stack, probably trying everything.&lt;/li&gt;
&lt;li&gt;Mailchimp → legacy stack, conservative, harder to displace.&lt;/li&gt;
&lt;li&gt;ReCharge → subscription-driven economics, focus on retention.&lt;/li&gt;
&lt;li&gt;Smile.io → loyalty-conscious, willing to invest in retention tooling.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Six weeks ago I would have called this overthinking. Today I run my outbound off it. Reply rate moved from 4% to 11% on a 200-account test, simply by changing the opener line to acknowledge the actual stack the prospect was running.&lt;/p&gt;




&lt;h2&gt;
  
  
  The build, in three observations
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Observation 1 — Most "Shopify scrapers" are not.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every existing tool I tested fell into one of two camps. Either it scraped products only (no app detection), or it detected apps but lookup-by-lookup through a Chrome extension. Nothing did both, in batch, at low volume cost.&lt;/p&gt;

&lt;p&gt;The closest matches were paid SaaS dashboards (Storeleads, Charm, BuiltWith) at $99 to $499 per month, with monthly export caps. For a one-off list of 1,200 stores I could not justify a $1,200 yearly subscription that I would forget to use.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observation 2 — App detection is a 10-line regex problem, not an ML problem.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every Shopify app I needed to detect ships through one of three patterns: a &lt;code&gt;&amp;lt;script src="cdn.[appname].com/..."&amp;gt;&lt;/code&gt; tag in the homepage HTML, a &lt;code&gt;&amp;lt;meta name="generator"&amp;gt;&lt;/code&gt; tag with the app name, or an inline &lt;code&gt;_q.push(...)&lt;/code&gt; queue call. Match on any of those three, OR them together, return a boolean.&lt;/p&gt;

&lt;p&gt;The whole detection module — covering Klaviyo, Yotpo, Judge.me, Loox, Stamped, Reviews.io, ReCharge, Bold, Skio, Privy, Justuno, Mailchimp, Postscript, Attentive, Smile.io, Searchanise, Boost, and 8 others — is about 600 lines of JavaScript including snapshot tests. New detectors are 15-minute additions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observation 3 — Pay-per-event pricing changes the unit economics.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Apify's Store lets developers price by the row. So a 500-product store with full app detection plus reviews costs about $0.30 to scan. A thousand-store batch costs about $3.&lt;/p&gt;

&lt;p&gt;That number matters. At $3 per batch, refreshing my ICP list weekly is a coffee. At $300 per batch (which is what specialized SaaS would charge for the same volume), I would refresh quarterly and miss every interesting signal in between.&lt;/p&gt;

&lt;p&gt;The cheaper the unit, the higher the refresh frequency. The higher the refresh frequency, the better the signal quality. This is true for every kind of competitive intelligence work, and it's the reason I shipped the actor as a public tool instead of keeping it private.&lt;/p&gt;




&lt;h2&gt;
  
  
  What surprised me
&lt;/h2&gt;

&lt;p&gt;Three things I did not expect, in order of how badly I underestimated them:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. &lt;code&gt;/products.json&lt;/code&gt; is more honest than the storefront.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Shopify's catalog endpoint exposes products that have been unpublished from the theme but are still live in the database — out-of-stock items, B2B-only SKUs, retired collections that nobody bothered to fully delete. For research, this is gold. You see what the merchant sells today and what they sold last quarter.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Reviews-app detection turned out to be the strongest lead signal.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;More predictive than email provider, more predictive than vertical, more predictive than location. A store paying for reviews is a store that has scaled past the early stage and is now optimizing for retention and social proof. That's where my offer lands.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. People want this packaged as an MCP tool for Claude.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Two of the first three external users asked. I had not planned for it. The pattern is clear though — once you can pipe Shopify-store data into Claude or Cursor and ask "qualify these 200 stores for my ICP", you stop opening spreadsheets. I am building it next.&lt;/p&gt;




&lt;h2&gt;
  
  
  The actor, if you want to use it
&lt;/h2&gt;

&lt;p&gt;I shipped the scraper on Apify Store as &lt;strong&gt;&lt;a href="https://apify.com/kazkn/shopify-scraper-apps-spy" rel="noopener noreferrer"&gt;Shopify Apps Spy + Product Scraper&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;What it does in one call:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pulls the &lt;strong&gt;full product catalog&lt;/strong&gt; for a list of Shopify URLs (titles, prices, variants, images, vendor, tags).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Detects installed apps&lt;/strong&gt; across email/SMS, reviews, subscriptions, popups, search, loyalty.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pulls reviews&lt;/strong&gt; when a reviews app is detected, by routing to that app's public reviews API.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What it costs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$0.005 per store for the standard tier (products + apps).&lt;/li&gt;
&lt;li&gt;$0.30 for a 500-product store with full reviews.&lt;/li&gt;
&lt;li&gt;Apify gives a $5 free credit on signup, which covers about 1,500 stores.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What it doesn't do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Historical data. If you need "who started using Klaviyo in Q1 2024," you want BuiltWith.&lt;/li&gt;
&lt;li&gt;Cross-platform. Shopify only. WooCommerce/Magento are different problems.&lt;/li&gt;
&lt;li&gt;Filtering by revenue band. Storeleads does that better.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're an indie founder, agency analyst, or sales rep doing 100-2,000 stores per month and you need raw exports, it should fit. If your volume is much larger or much smaller, the SaaS competitors are probably the right call.&lt;/p&gt;




&lt;h2&gt;
  
  
  The takeaway, if you skim
&lt;/h2&gt;

&lt;p&gt;Three things I would do differently if I were starting over:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Build the qualification tool before you start prospecting, not after&lt;/strong&gt;. The 60-hour manual baseline is what kills the experiment. Every B2B founder I have asked has the same story.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Treat tech-stack data as ICP data, not technographic trivia&lt;/strong&gt;. The app a store runs is downstream of their stage, budget, and team size. Use it that way.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refresh weekly, not quarterly&lt;/strong&gt;. Cheap refresh frequency beats expensive depth nine times out of ten in early-stage outbound.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The scraper is on &lt;strong&gt;&lt;a href="https://apify.com/kazkn/shopify-scraper-apps-spy" rel="noopener noreferrer"&gt;Apify Store&lt;/a&gt;&lt;/strong&gt;, free $5 credit covers your first batch. If a detector is missing, ping me — each is a 15-minute add.&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;How do I detect what apps a Shopify store is using?&lt;/strong&gt;&lt;br&gt;
Apps inject identifiable scripts into the storefront HTML — &lt;code&gt;cdn.judge.me&lt;/code&gt;, &lt;code&gt;cdn.yotpo.com&lt;/code&gt;, &lt;code&gt;klaviyo.com/onsite&lt;/code&gt;, etc. Either inspect the page source manually (3 minutes per store) or use &lt;a href="https://apify.com/kazkn/shopify-scraper-apps-spy" rel="noopener noreferrer"&gt;Shopify Apps Spy + Product Scraper&lt;/a&gt; to detect 150+ apps in batch at $0.005 per store.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is scraping Shopify legal?&lt;/strong&gt;&lt;br&gt;
Yes for publicly accessible product data. Shopify exposes &lt;code&gt;/products.json&lt;/code&gt; on every storefront and the homepage HTML is public. No login, no API key, no proxy needed for most stores. You're reading what the merchant chose to publish.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How long does it take to scan 1,000 Shopify stores?&lt;/strong&gt;&lt;br&gt;
About 25 minutes at a polite concurrency of 5 simultaneous requests. The bottleneck is &lt;code&gt;/products.json&lt;/code&gt; response size, not rate limits — Shopify storefronts handle this volume without complaint.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's a realistic cost for B2B lead qualification across 1,200 Shopify stores?&lt;/strong&gt;&lt;br&gt;
Around $3 of compute on Apify's pay-per-event pricing — $0.005 per store for products + apps detection, $0.30 for full reviews. The $5 free Apify credit covers your first ~1,500 stores.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Which Shopify apps are the strongest signal of B2B SaaS fit?&lt;/strong&gt;&lt;br&gt;
Reviews providers (Yotpo Premium, Okendo, Stamped Pro) signal seven-figure DTC. Klaviyo paid plans signal a mature email program. Postscript or Attentive signal SMS-first modern stack. Smile.io signals retention-conscious operators ready to invest in tooling.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Tags: shopify, b2b, lead-generation, ecommerce, indiehackers&lt;/em&gt;&lt;/p&gt;

</description>
      <category>shopify</category>
      <category>ecommerce</category>
      <category>b2b</category>
      <category>indiehackers</category>
    </item>
    <item>
      <title>I built a Shopify scraper that detects apps + pulls products in one API call</title>
      <dc:creator>KazKN</dc:creator>
      <pubDate>Sat, 02 May 2026 11:50:01 +0000</pubDate>
      <link>https://dev.to/boo_n/i-built-a-shopify-scraper-that-detects-apps-pulls-products-in-one-api-call-5a8b</link>
      <guid>https://dev.to/boo_n/i-built-a-shopify-scraper-that-detects-apps-pulls-products-in-one-api-call-5a8b</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; — Existing Shopify app detectors (Koala Inspector, ShopScan, Fera, BuiltWith) are Chrome extensions or SaaS dashboards. None do batch. I had 1,200 stores to qualify and View Source + Cmd-F was killing my afternoons, so I shipped an Apify actor that takes a list of Shopify URLs and returns the full app stack (Klaviyo, Yotpo, Judge.me, Loox, ReCharge…) + product catalog + reviews in JSON. No headless browser, ~$0.005 per store, 1,000 stores in 25 minutes. Live here → &lt;a href="https://apify.com/kazkn/shopify-scraper-apps-spy" rel="noopener noreferrer"&gt;Shopify Scraper – Apps Spy + Reviews&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The afternoon that broke me
&lt;/h2&gt;

&lt;p&gt;Six weeks ago I was prospecting for a B2B side-project. The hypothesis: "Shopify stores running Klaviyo + a paid reviews app are the right ICP — they spend money on retention tooling, so they will pay for ours."&lt;/p&gt;

&lt;p&gt;To validate, I needed a list of Shopify stores &lt;strong&gt;and&lt;/strong&gt; their installed apps.&lt;/p&gt;

&lt;p&gt;The Shopify App Store does not give you that. The "stores using X" databases do, but the public ones are stale and the good ones are paid SaaS at $99–499/month for filters I did not need.&lt;/p&gt;

&lt;p&gt;So I did what every founder does at 11 PM: I opened View Source on a competitor list, hit &lt;code&gt;Cmd-F&lt;/code&gt;, and started typing &lt;code&gt;klaviyo&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It worked. Sort of. I did 40 stores in two hours, then stopped, because I had a list of 1,200.&lt;/p&gt;

&lt;p&gt;That night I wrote the first version of what is now &lt;a href="https://apify.com/kazkn/shopify-scraper-apps-spy" rel="noopener noreferrer"&gt;Shopify Scraper – Apps Spy + Reviews&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I actually wanted
&lt;/h2&gt;

&lt;p&gt;Every "Shopify scraper" I found online did one of two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Scraped a single store's products via &lt;code&gt;/products.json&lt;/code&gt; — table-stakes, dozens of free Apify actors do it.&lt;/li&gt;
&lt;li&gt;Spawned a headless browser to fingerprint a marketing site — slow, expensive, and brittle.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I wanted three things in one pass:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full &lt;strong&gt;product catalog&lt;/strong&gt; (titles, prices, variants, images, vendor, tags) — nothing exotic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;App detection&lt;/strong&gt;: which third-party Shopify apps are installed (email, reviews, subscriptions, popups, search).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reviews&lt;/strong&gt; when a reviews app is detected — pull them via the public API, not by parsing widgets.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And I wanted it to be &lt;strong&gt;cheap&lt;/strong&gt;, because I had ~1,200 stores in my first batch and I planned to run it monthly.&lt;/p&gt;

&lt;h2&gt;
  
  
  The "no headless browser" decision
&lt;/h2&gt;

&lt;p&gt;The thing nobody tells you about Shopify scraping is that you almost never need a headless browser. The signals you want for app detection live in three places, and all three are reachable with a plain HTTPS GET:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The HTML of the homepage&lt;/strong&gt;. Shopify apps inject &lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt; tags from their own CDN. &lt;code&gt;cdn.judge.me&lt;/code&gt;, &lt;code&gt;cdn.yotpo.com&lt;/code&gt;, &lt;code&gt;loox.io/widget&lt;/code&gt;, &lt;code&gt;klaviyo.com/onsite&lt;/code&gt; — you grep the HTML and you know.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;/products.json&lt;/code&gt;&lt;/strong&gt;. Shopify exposes the full catalog at this path on every store, paginated 250 items at a time. No auth, no headless. (You hit a soft rate limit around 2 req/s per IP, which is fine if you queue politely.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;App-specific public endpoints&lt;/strong&gt;. Judge.me has a JSON reviews endpoint. Yotpo too. Same for Loox, Stamped, Reviews.io. Once you know which app is installed, you go straight to its API — no DOM parsing.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The whole actor is built around that observation. No Puppeteer, no Playwright, no proxy farm. Just &lt;code&gt;got-scraping&lt;/code&gt;, &lt;code&gt;cheerio&lt;/code&gt;, and &lt;code&gt;p-queue&lt;/code&gt; to keep concurrency civilized.&lt;/p&gt;

&lt;p&gt;The result is that scanning a single store costs ~3–6 HTTPS requests and runs in &lt;strong&gt;2 to 8 seconds&lt;/strong&gt; depending on catalog size. Cost on Apify infra: about $0.005 per store for the "tech stack only" mode.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architecture (it is small on purpose)
&lt;/h2&gt;

&lt;p&gt;I'll be honest — I almost over-engineered this. My first draft had Redis for de-dup, a queue, retry logic with exponential backoff, and a state machine. Then I deleted all of it.&lt;/p&gt;

&lt;p&gt;Here is what shipped:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;src/
├── main.js                   # orchestration (p-queue, per-store flow)
├── crawlers/
│   ├── products.js           # /products.json + sitemap fallback
│   ├── apps.js               # detect apps from homepage HTML
│   └── reviews.js            # per-app reviews fetchers
└── lib/
    ├── normalize.js          # canonicalize URLs, normalize product schema
    ├── schemas.js            # zod validation for outputs
    └── billing.js            # Apify pay-per-event charges
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A run goes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Canonicalize the store URL (handles &lt;code&gt;www&lt;/code&gt;, custom domains, &lt;code&gt;*.myshopify.com&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Fetch the homepage once. Confirm it is Shopify (the &lt;code&gt;x-shopify-stage&lt;/code&gt; header is a giveaway).&lt;/li&gt;
&lt;li&gt;From the same HTML, run the app detectors. Each detector is ~10 lines of regex matching against script tags + meta tags + inline JSON.&lt;/li&gt;
&lt;li&gt;Fetch &lt;code&gt;/products.json?page=N&lt;/code&gt; until you hit the cap or run out of products.&lt;/li&gt;
&lt;li&gt;If the user asked for reviews and an installed reviews app was detected, fan out to that app's public reviews API.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That is it. The whole thing is ~900 lines of JavaScript. I run it with &lt;code&gt;node --test&lt;/code&gt; for unit tests against snapshots and a &lt;code&gt;tests/smoke-products.mjs&lt;/code&gt; that hits 5 real stores end-to-end.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned about app detection
&lt;/h2&gt;

&lt;p&gt;The regex-against-HTML approach has one trap. Shopify themes minify, version, and CDN-rewrite their assets, so you cannot match on a single string. The Klaviyo loader, for example, ships under at least four URL patterns I have seen:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;static.klaviyo.com/onsite/js/klaviyo.js&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;static-tracking.klaviyo.com/onsite/js/...&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;a.klaviyo.com/media/...&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;inline &lt;code&gt;_learnq&lt;/code&gt; queue calls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You match &lt;strong&gt;any&lt;/strong&gt; of those, and you call it Klaviyo. Same logic for every other app — every detector is an array of patterns, OR'd together, returning a single boolean. I wrote a snapshot test per app with a real store HTML page so a Klaviyo URL change does not silently break detection.&lt;/p&gt;

&lt;p&gt;The detectors I shipped on day one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Email/SMS&lt;/strong&gt;: Klaviyo, Omnisend, Postscript, Mailchimp, Attentive&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reviews&lt;/strong&gt;: Yotpo, Judge.me, Loox, Stamped, Reviews.io, Okendo&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Subscriptions&lt;/strong&gt;: ReCharge, Bold, Loop, Skio&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Popups &amp;amp; SMS capture&lt;/strong&gt;: Privy, Justuno, Klaviyo Forms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search &amp;amp; discovery&lt;/strong&gt;: Searchanise, Boost, Algolia&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Loyalty&lt;/strong&gt;: Smile.io, Yotpo Loyalty, LoyaltyLion&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you tell me an app I missed, I add a detector. Each one is a 15-minute job.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pay-per-event pricing problem
&lt;/h2&gt;

&lt;p&gt;Apify lets you charge per event instead of per compute minute. For a scraper that runs in seconds, this is the right model — your customer pays for the rows they get, not for compute time.&lt;/p&gt;

&lt;p&gt;The mistake I made on my first push was leaving Apify's default &lt;code&gt;dataset_item&lt;/code&gt; event on. Combined with my custom &lt;code&gt;product_extracted&lt;/code&gt; event, every product was being charged twice. I caught it in monetization review and removed the synthetic event.&lt;/p&gt;

&lt;p&gt;The pricing I landed on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;store_analyzed&lt;/code&gt; — $0.003 per store (covers detection + products fetch)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;product_extracted&lt;/code&gt; — $0.0005 per product&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;apps_detected&lt;/code&gt; — $0.001 per store at standard+&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;review_extracted&lt;/code&gt; — $0.0003 per review&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A 500-product store with reviews costs roughly &lt;strong&gt;$0.30&lt;/strong&gt; end to end. For comparison, the SaaS competitors charge $99 or more for similar lookups, batched and capped.&lt;/p&gt;

&lt;h2&gt;
  
  
  What surprised me
&lt;/h2&gt;

&lt;p&gt;Three things, in order of how badly I underestimated them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. &lt;code&gt;/products.json&lt;/code&gt; is more honest than the storefront.&lt;/strong&gt; It exposes products that are unpublished from the theme but still live (out-of-stock holdovers, B2B-only SKUs). Useful for trend research. Sometimes shocking.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Reviews-app detection is a lead signal.&lt;/strong&gt; A store on Judge.me Free plan vs. Yotpo Premium tells you a lot about their stage. I ended up using this internally to prioritize cold outbound — different pitch for a $30/month stack vs. a $1,200/month stack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. People want this as an MCP server.&lt;/strong&gt; Two of my first three users asked if they could query it from Claude / ChatGPT. I have it on the roadmap. (My &lt;a href="https://apify.com/kazkn/gpt-crawler-mcp" rel="noopener noreferrer"&gt;GPT Crawler MCP&lt;/a&gt; and &lt;a href="https://apify.com/kazkn/vinted-mcp-server" rel="noopener noreferrer"&gt;Vinted MCP Server&lt;/a&gt; are the two MCP actors I shipped first; the Shopify one is next.)&lt;/p&gt;

&lt;h2&gt;
  
  
  How to use it in one minute
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// On Apify, paste this in the actor input box&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;store_urls&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://allbirds.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://gymshark.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;extract_level&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;standard&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// products + apps stack&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;max_products_per_store&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;250&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output (one record per product, with &lt;code&gt;apps_detected&lt;/code&gt; attached):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"store_domain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"allbirds.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"product_title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Wool Runner"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;110&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"available"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"vendor"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allbirds"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"apps_detected"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Klaviyo"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"reviews"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Yotpo"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"subscriptions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Searchanise"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"product_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://allbirds.com/products/mens-wool-runners"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want reviews, set &lt;code&gt;extract_level: "full"&lt;/code&gt; and a &lt;code&gt;max_reviews_per_product&lt;/code&gt;. The actor will route to the correct reviews API based on what was detected.&lt;/p&gt;

&lt;p&gt;Direct link, free $5 credit on Apify, no account-creation drama: &lt;strong&gt;&lt;a href="https://apify.com/kazkn/shopify-scraper-apps-spy" rel="noopener noreferrer"&gt;Shopify Scraper – Apps Spy + Reviews&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is scraping &lt;code&gt;/products.json&lt;/code&gt; allowed?
&lt;/h3&gt;

&lt;p&gt;Shopify exposes &lt;code&gt;/products.json&lt;/code&gt; publicly on every store by default. Stores that disable it (rare) return 404 and the actor logs a skip. The actor never authenticates, never bypasses access controls, and respects standard rate limits.&lt;/p&gt;

&lt;h3&gt;
  
  
  What about reCAPTCHA or Cloudflare?
&lt;/h3&gt;

&lt;p&gt;For the standard catalog and app-detection flow, no. &lt;code&gt;/products.json&lt;/code&gt; and the homepage HTML are not gated. For some reviews APIs, very high request volumes can trigger rate-limits — the actor backs off and retries with jitter.&lt;/p&gt;

&lt;h3&gt;
  
  
  How is this different from Koala Inspector, ShopScan or BuiltWith?
&lt;/h3&gt;

&lt;p&gt;Koala Inspector, ShopScan and Fera are excellent Chrome extensions for one-store lookups, but none of them do batch — you cannot paste 500 URLs and get a CSV back. BuiltWith is a generic tech-stack tool with broad coverage but its Shopify-app detection is shallow and you cannot pull products in the same call. This actor is purpose-built for Shopify and runs in batch via API: deeper app detection (subscriptions, reviews, popups, search, loyalty), full product catalog, and reviews — all in one pass, billed pay-per-event.&lt;/p&gt;

&lt;h3&gt;
  
  
  How long does a 1,000-store scan take?
&lt;/h3&gt;

&lt;p&gt;About 25 minutes at default concurrency, costing ~$3 of Apify credits at the &lt;code&gt;standard&lt;/code&gt; level. A &lt;code&gt;full&lt;/code&gt; run with reviews is closer to an hour and ~$15 depending on review volume.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I get one record per variant instead of per product?
&lt;/h3&gt;

&lt;p&gt;Yes. Set &lt;code&gt;include_variants: true&lt;/code&gt; in the input and the dataset returns one row per SKU with size/color/price/availability normalized.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is next
&lt;/h2&gt;

&lt;p&gt;I want to add three things, in order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Revenue estimation&lt;/strong&gt; at the &lt;code&gt;pro&lt;/code&gt; tier — based on review velocity and product velocity, both of which are observable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP server mode&lt;/strong&gt; so you can query it from Claude desktop / Cursor.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Theme detection&lt;/strong&gt; — useful for agency outbound, less useful for me, but I keep being asked.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you use it and something breaks, ping me — I am the only maintainer and I read every issue. The actor is on Apify Store at &lt;a href="https://apify.com/kazkn/shopify-scraper-apps-spy" rel="noopener noreferrer"&gt;kazkn/shopify-scraper-apps-spy&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Tags: #shopify #ecommerce #api #indiehackers&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Was this useful?&lt;/strong&gt; ❤️ a reaction or drop a comment with the use-case you're trying to solve — I read every reply and add detector + endpoint coverage based on what people actually need.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Follow &lt;a href="https://dev.to/boo_n"&gt;@boo_n&lt;/a&gt;&lt;/strong&gt; for hands-on tutorials: scraping reviews at scale, ICP qualification at $0.005 per store, and turning the actor into an MCP tool for Claude / Cursor.&lt;/p&gt;

</description>
      <category>shopify</category>
      <category>ecommerce</category>
      <category>api</category>
      <category>indiehackers</category>
    </item>
  </channel>
</rss>
