<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: root</title>
    <description>The latest articles on DEV Community by root (@fistonuser).</description>
    <link>https://dev.to/fistonuser</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2893790%2Fdbaff6a5-4463-41c6-952b-5ba7f4d9346a.jpeg</url>
      <title>DEV Community: root</title>
      <link>https://dev.to/fistonuser</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/fistonuser"/>
    <language>en</language>
    <item>
      <title>I Built a URL Metadata API After Wasting Days on Manual Scraping</title>
      <dc:creator>root</dc:creator>
      <pubDate>Mon, 10 Nov 2025 16:58:31 +0000</pubDate>
      <link>https://dev.to/fistonuser/i-built-a-url-metadata-api-after-wasting-days-on-manual-scraping-3m81</link>
      <guid>https://dev.to/fistonuser/i-built-a-url-metadata-api-after-wasting-days-on-manual-scraping-3m81</guid>
      <description>&lt;h2&gt;
  
  
  The Problem That Wouldn't Go Away
&lt;/h2&gt;

&lt;p&gt;I was building a bookmark manager (because apparently that's what developers do when they can't find the perfect one). Everything was going smoothly until I hit the "add bookmark" feature.&lt;/p&gt;

&lt;p&gt;Users paste a URL, and I needed to show them a nice preview card with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Title&lt;/li&gt;
&lt;li&gt;Description
&lt;/li&gt;
&lt;li&gt;Image&lt;/li&gt;
&lt;li&gt;Favicon&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Simple, right? &lt;strong&gt;Wrong.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  First Attempt: The BeautifulSoup Nightmare
&lt;/h2&gt;

&lt;p&gt;Started with Python and BeautifulSoup. Wrote some code to fetch HTML and parse meta tags:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bs4&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;soup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BeautifulSoup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;html.parser&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;meta&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;og:title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;meta&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;og:description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;meta&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;og:image&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Issues I immediately ran into:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;JavaScript-rendered sites&lt;/strong&gt; - Half the modern web doesn't have meta tags in the initial HTML. They're added client-side.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Inconsistent meta tag formats&lt;/strong&gt; - Some sites use &lt;code&gt;og:title&lt;/code&gt;, others use &lt;code&gt;twitter:title&lt;/code&gt;, some just use &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt;. You need fallback logic for everything.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bot detection&lt;/strong&gt; - Cloudflare and other services block obvious scrapers. My IP got banned testing on my own URLs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Timeouts&lt;/strong&gt; - Some sites take 10+ seconds to respond. Had to implement aggressive timeouts and retry logic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Malformed HTML&lt;/strong&gt; - So many sites have broken HTML that crashes parsers.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After three days of fighting edge cases, I gave up on the DIY approach.&lt;/p&gt;

&lt;h2&gt;
  
  
  Second Attempt: Existing APIs
&lt;/h2&gt;

&lt;p&gt;Found a few metadata extraction APIs. The cheapest one was $49/month for 10,000 requests. For a side project that &lt;em&gt;might&lt;/em&gt; get 100 users? No thanks.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: Build My Own
&lt;/h2&gt;

&lt;p&gt;I realized this problem isn't going away. Every developer building bookmarks, chat apps, or content aggregators hits this wall.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;Scrapix&lt;/strong&gt; - a production-ready metadata extraction API that handles all the annoying edge cases.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Simple API Call
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://scrapix-api.com/extract&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;X-RapidAPI-Key&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;your-api-key&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://example.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Response Format
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://dev.to"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DEV Community"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Where programmers share ideas and help each other grow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"image"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://dev.to/social-preview.png"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"favicon"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://dev.to/favicon.ico"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"site_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DEV Community"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean, consistent, predictable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World React Example
&lt;/h2&gt;

&lt;p&gt;Here's how I use it in my bookmark manager:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;BookmarkForm&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setUrl&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setMetadata&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;loading&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setLoading&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fetchMetadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;setLoading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://scrapix-api.com/extract&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;X-RapidAPI-Key&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;RAPIDAPI_KEY&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nf"&gt;setMetadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nf"&gt;setLoading&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;input&lt;/span&gt; 
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; 
        &lt;span class="na"&gt;onChange&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setUrl&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
        &lt;span class="na"&gt;placeholder&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Paste URL..."&lt;/span&gt;
      &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt; &lt;span class="na"&gt;onClick&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;fetchMetadata&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;loading&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Loading...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Fetch Preview&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"preview-card"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;img&lt;/span&gt; &lt;span class="na"&gt;src&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;image&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;alt&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;h3&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;h3&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;p&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Under the Hood: How I Solved the Hard Problems
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. JavaScript-Rendered Content
&lt;/h3&gt;

&lt;p&gt;Used headless browsers (Puppeteer) for sites that require client-side rendering. The API detects whether a page needs JS execution and routes accordingly.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Smart Caching
&lt;/h3&gt;

&lt;p&gt;5-minute cache on all metadata requests. Most link previews don't change that often, so this cuts redundant requests by ~70%.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Multiple Fallback Strategies
&lt;/h3&gt;

&lt;p&gt;The parser tries in order:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenGraph tags (&lt;code&gt;og:*&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Twitter Cards (&lt;code&gt;twitter:*&lt;/code&gt;)
&lt;/li&gt;
&lt;li&gt;Standard meta tags&lt;/li&gt;
&lt;li&gt;HTML tags (&lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;h1&amp;gt;&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;URL parsing (domain name as fallback)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Global Performance
&lt;/h3&gt;

&lt;p&gt;Deployed on Vercel's edge network. Average response time: 800ms. Cached responses: &amp;lt;100ms.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Batch Processing
&lt;/h3&gt;

&lt;p&gt;Need metadata for 10 URLs at once? One API call handles it.&lt;/p&gt;

&lt;h2&gt;
  
  
  When Should You Use This vs DIY?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Build it yourself if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have one specific site to scrape&lt;/li&gt;
&lt;li&gt;You need very custom data extraction&lt;/li&gt;
&lt;li&gt;You're building a scraping tool as the core product&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use an API if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need it to work across any website&lt;/li&gt;
&lt;li&gt;You don't want to maintain parsing logic&lt;/li&gt;
&lt;li&gt;Your time is better spent on your actual product&lt;/li&gt;
&lt;li&gt;You need reliable uptime and performance&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Common Use Cases
&lt;/h2&gt;

&lt;p&gt;I've seen developers use Scrapix for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bookmark managers&lt;/strong&gt; - Auto-fill card previews&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chat apps&lt;/strong&gt; - Rich link previews (like Slack/Discord)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content aggregators&lt;/strong&gt; - Pull article metadata&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Social schedulers&lt;/strong&gt; - Preview posts before publishing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CMS platforms&lt;/strong&gt; - Auto-populate article details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RSS readers&lt;/strong&gt; - Enhance feed items with images&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Performance Matters
&lt;/h2&gt;

&lt;p&gt;For my bookmark app, metadata extraction went from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DIY approach&lt;/strong&gt;: 3-15 seconds (with failures)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;With Scrapix&lt;/strong&gt;: 800ms average, 100ms cached&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the difference between users waiting and users not noticing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing That Makes Sense
&lt;/h2&gt;

&lt;p&gt;Free tier: 100 requests/month (perfect for testing)&lt;br&gt;&lt;br&gt;
Basic: 1,000 requests/month&lt;br&gt;&lt;br&gt;
Pro: 10,000 requests/month&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Check RapidAPI for current pricing)&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;The API is live on RapidAPI: &lt;a href="https://rapidapi.com/fistonturner/api/scrapix" rel="noopener noreferrer"&gt;Scrapix API&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Free tier means you can test it with zero risk. If it doesn't solve your problem, you've lost nothing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned Building This
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Edge cases are infinite&lt;/strong&gt; - Every site has its own quirks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance beats features&lt;/strong&gt; - Developers want fast, reliable, simple&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Good APIs solve one problem perfectly&lt;/strong&gt; - Don't try to do everything&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching is magic&lt;/strong&gt; - 5-minute TTL made costs drop dramatically&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;If you're building anything that needs link previews, don't waste three days like I did. Your time is worth more than that.&lt;/p&gt;

&lt;p&gt;Focus on what makes your product unique. Let specialized tools handle the boring infrastructure stuff.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Have you built something similar? What challenges did you face?&lt;/strong&gt; Drop a comment below - always curious to hear other developers' experiences with web scraping and metadata extraction.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;P.S. - If you found this helpful, I'd love to hear what you're building! Connect with me on [&lt;a href="https://github.com/fiston-user" rel="noopener noreferrer"&gt;https://github.com/fiston-user&lt;/a&gt;]&lt;/em&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>webdev</category>
      <category>javascript</category>
      <category>python</category>
    </item>
    <item>
      <title>support images</title>
      <dc:creator>root</dc:creator>
      <pubDate>Wed, 26 Feb 2025 21:47:50 +0000</pubDate>
      <link>https://dev.to/fistonuser/support-images-379e</link>
      <guid>https://dev.to/fistonuser/support-images-379e</guid>
      <description>&lt;p&gt;support images&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Futfs.io%2Ff%2FvEzJn4Lw5NpAt914S2OnGDTB3s1vWMYOp5PC2UxXaJlAdiER" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Futfs.io%2Ff%2FvEzJn4Lw5NpAt914S2OnGDTB3s1vWMYOp5PC2UxXaJlAdiER" alt="Video" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>emptystring</category>
    </item>
    <item>
      <title>slsl</title>
      <dc:creator>root</dc:creator>
      <pubDate>Wed, 26 Feb 2025 21:46:29 +0000</pubDate>
      <link>https://dev.to/fistonuser/slsl-kcd</link>
      <guid>https://dev.to/fistonuser/slsl-kcd</guid>
      <description>&lt;p&gt;slsl&lt;/p&gt;

</description>
    </item>
    <item>
      <title>kqkl</title>
      <dc:creator>root</dc:creator>
      <pubDate>Wed, 26 Feb 2025 21:46:10 +0000</pubDate>
      <link>https://dev.to/fistonuser/kqkl-285k</link>
      <guid>https://dev.to/fistonuser/kqkl-285k</guid>
      <description>&lt;p&gt;kqkl&lt;/p&gt;

</description>
    </item>
    <item>
      <title>post on all textxs platforms</title>
      <dc:creator>root</dc:creator>
      <pubDate>Wed, 26 Feb 2025 21:44:12 +0000</pubDate>
      <link>https://dev.to/fistonuser/post-on-all-textxs-platforms-ni8</link>
      <guid>https://dev.to/fistonuser/post-on-all-textxs-platforms-ni8</guid>
      <description>&lt;p&gt;post on all textxs platforms&lt;/p&gt;

</description>
      <category>emptystring</category>
    </item>
    <item>
      <title>reddit tell me</title>
      <dc:creator>root</dc:creator>
      <pubDate>Tue, 25 Feb 2025 16:59:00 +0000</pubDate>
      <link>https://dev.to/fistonuser/reddit-tell-me-20dj</link>
      <guid>https://dev.to/fistonuser/reddit-tell-me-20dj</guid>
      <description>&lt;p&gt;reddit tell me&lt;/p&gt;

</description>
      <category>emptystring</category>
    </item>
    <item>
      <title>yh</title>
      <dc:creator>root</dc:creator>
      <pubDate>Mon, 24 Feb 2025 21:55:23 +0000</pubDate>
      <link>https://dev.to/fistonuser/yh-3bk8</link>
      <guid>https://dev.to/fistonuser/yh-3bk8</guid>
      <description>&lt;p&gt;yh&lt;/p&gt;

</description>
    </item>
    <item>
      <title>test</title>
      <dc:creator>root</dc:creator>
      <pubDate>Mon, 24 Feb 2025 21:52:56 +0000</pubDate>
      <link>https://dev.to/fistonuser/test-34j7</link>
      <guid>https://dev.to/fistonuser/test-34j7</guid>
      <description>&lt;p&gt;test&lt;/p&gt;

</description>
      <category>emptystring</category>
    </item>
    <item>
      <title>test</title>
      <dc:creator>root</dc:creator>
      <pubDate>Mon, 24 Feb 2025 16:30:51 +0000</pubDate>
      <link>https://dev.to/fistonuser/test-2d91</link>
      <guid>https://dev.to/fistonuser/test-2d91</guid>
      <description>&lt;p&gt;test&lt;/p&gt;

</description>
      <category>emptystring</category>
    </item>
    <item>
      <title>qoqo</title>
      <dc:creator>root</dc:creator>
      <pubDate>Mon, 24 Feb 2025 16:29:54 +0000</pubDate>
      <link>https://dev.to/fistonuser/qoqo-2eb9</link>
      <guid>https://dev.to/fistonuser/qoqo-2eb9</guid>
      <description>&lt;p&gt;qoqo&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Futfs.io%2Ff%2FvEzJn4Lw5NpAE09ZBqkUR5n7uM1jL2eQVlKYFwm9NfBaJbhA" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Futfs.io%2Ff%2FvEzJn4Lw5NpAE09ZBqkUR5n7uM1jL2eQVlKYFwm9NfBaJbhA" alt="Image" width="768" height="768"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>emptystring</category>
    </item>
    <item>
      <title>testing</title>
      <dc:creator>root</dc:creator>
      <pubDate>Sat, 22 Feb 2025 16:42:27 +0000</pubDate>
      <link>https://dev.to/fistonuser/testing-2il2</link>
      <guid>https://dev.to/fistonuser/testing-2il2</guid>
      <description>&lt;p&gt;testing&lt;/p&gt;

</description>
      <category>testing</category>
    </item>
    <item>
      <title>test schduled</title>
      <dc:creator>root</dc:creator>
      <pubDate>Sat, 22 Feb 2025 13:25:00 +0000</pubDate>
      <link>https://dev.to/fistonuser/test-schduled-3m5n</link>
      <guid>https://dev.to/fistonuser/test-schduled-3m5n</guid>
      <description>&lt;p&gt;test schduled&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
