<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: kjyoun-ai</title>
    <description>The latest articles on DEV Community by kjyoun-ai (@kjyounai).</description>
    <link>https://dev.to/kjyounai</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3869384%2Fdc7b19a7-6b5b-4fdf-b741-4b1c8321a992.png</url>
      <title>DEV Community: kjyoun-ai</title>
      <link>https://dev.to/kjyounai</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kjyounai"/>
    <language>en</language>
    <item>
      <title>Marker, hosted: a scientific PDF parser API with LaTeX equations preserved</title>
      <dc:creator>kjyoun-ai</dc:creator>
      <pubDate>Thu, 09 Apr 2026 09:05:27 +0000</pubDate>
      <link>https://dev.to/kjyounai/marker-hosted-a-scientific-pdf-parser-api-with-latex-equations-preserved-5df8</link>
      <guid>https://dev.to/kjyounai/marker-hosted-a-scientific-pdf-parser-api-with-latex-equations-preserved-5df8</guid>
      <description>&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;I kept hitting the same wall when building RAG pipelines over research papers: every generic PDF parser I tried mangled the equations.&lt;/p&gt;

&lt;p&gt;Adobe Extract, AWS Textract, pdfplumber, PyMuPDF — they all collapse display math into plain-text garbage. &lt;code&gt;Attention(Q,K,V) = softmax(QK^T / √d_k) V&lt;/code&gt; becomes something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;QKT √dk

Attention(Q,K,V ) = softmax(

)V (1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Unusable. Your embedding model sees a soup of tokens. Your LLM has no idea what the equation means. Your RAG answers are wrong on anything math-heavy.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I tried
&lt;/h2&gt;

&lt;p&gt;I benchmarked the obvious options on a handful of arXiv papers I cared about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Docling&lt;/strong&gt; (IBM): drops every display equation as a placeholder. ~5/12 on a controlled equation-extraction benchmark.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nougat&lt;/strong&gt; (Meta): the results were actually good when it worked, but the repo is essentially unmaintained and the dependency tree is a minefield.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mistral OCR&lt;/strong&gt;: cheap and general-purpose, but equation fidelity is inconsistent on papers with dense notation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LlamaParse&lt;/strong&gt;: optimized for "give me RAG chunks", not "preserve the math".&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Marker&lt;/strong&gt; (github.com/datalab-to/marker): the only OSS tool that consistently produced clean LaTeX. Scored ~10.5/12 on the same benchmark Docling scored 5 on.&lt;/li&gt;
&lt;/ul&gt;
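&lt;p&gt;A toy version of equation-fidelity scoring with partial credit (which is how a fractional score like 10.5/12 can arise). The normalization rules below are illustrative, not my exact harness:&lt;/p&gt;

```python
import re

def normalize(tex: str) -> str:
    """Canonicalize a LaTeX string for comparison: drop whitespace and
    size-only commands (\\left, \\right) that don't change the math."""
    tex = re.sub(r"\\(left|right)\b", "", tex)
    tex = re.sub(r"\s+", "", tex)
    return tex

def score_equation(extracted: str, reference: str) -> float:
    """1.0 for an exact match after normalization, 0.5 if the strings
    only differ in interchangeable wrappers (\\mathrm vs \\text), else 0."""
    a, b = normalize(extracted), normalize(reference)
    if a == b:
        return 1.0
    loosen = lambda s: s.replace(r"\mathrm", r"\text")
    if loosen(a) == loosen(b):
        return 0.5
    return 0.0
```

&lt;p&gt;Summing per-equation scores over the 12 test equations gives the numbers quoted above.&lt;/p&gt;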

&lt;h2&gt;
  
  
  Why I didn't just use Marker directly
&lt;/h2&gt;

&lt;p&gt;Marker is the right tool, but running it yourself is not trivial:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;5GB of model weights to download on first run&lt;/li&gt;
&lt;li&gt;CUDA + PyTorch + transformers + torchvision version dance&lt;/li&gt;
&lt;li&gt;GPU server to host it (T4 or better — CPU inference takes ~10x longer)&lt;/li&gt;
&lt;li&gt;A queue because parses take 60–180 seconds and you can't block an HTTP request that long&lt;/li&gt;
&lt;li&gt;Idle GPU bills when nobody is parsing anything&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a side project, this was 2+ days of yak shaving before I could POST my first PDF. I wanted a one-line API.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;I wrapped Marker in a Modal deployment and put an async FastAPI app in front of it. Two endpoints:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Submit a paper&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://scientific-paper-parser1.p.rapidapi.com/parse-paper &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-RapidAPI-Key: &lt;/span&gt;&lt;span class="nv"&gt;$KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-F&lt;/span&gt; &lt;span class="s2"&gt;"url=https://arxiv.org/pdf/1706.03762"&lt;/span&gt;
&lt;span class="c"&gt;# → {"call_id": "fc-01K...", "status": "queued"}&lt;/span&gt;

&lt;span class="c"&gt;# Poll for the result&lt;/span&gt;
curl https://scientific-paper-parser1.p.rapidapi.com/parse-paper/&lt;span class="nv"&gt;$ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"X-RapidAPI-Key: &lt;/span&gt;&lt;span class="nv"&gt;$KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="c"&gt;# → {"status": "done", "result": {"markdown": "...", ...}}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
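&lt;p&gt;The same flow from the Python side is a short submit-then-poll loop. A stdlib-only sketch (the endpoint paths match the curl calls above; the backoff schedule is my own choice, not anything the API requires):&lt;/p&gt;

```python
import json
import time
import urllib.request

BASE = "https://scientific-paper-parser1.p.rapidapi.com"

def backoff_schedule(first=5.0, factor=1.5, cap=30.0):
    """Yield poll intervals: start at 5s, grow 1.5x, cap at 30s.
    Parses take 60-180s, so polling every second just burns quota."""
    delay = first
    while True:
        yield delay
        delay = min(delay * factor, cap)

def wait_for_result(call_id, api_key, max_wait=600):
    """Poll GET /parse-paper/{call_id} until status is 'done'."""
    deadline = time.monotonic() + max_wait
    for delay in backoff_schedule():
        req = urllib.request.Request(
            f"{BASE}/parse-paper/{call_id}",
            headers={"X-RapidAPI-Key": api_key},
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        if body["status"] == "done":
            return body["result"]
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"parse {call_id} still running after {max_wait}s")
        time.sleep(delay)
```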



&lt;p&gt;And on the same Attention paper, it returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tex"&gt;&lt;code&gt;&lt;span class="p"&gt;$$&lt;/span&gt;&lt;span class="nv"&gt;\text&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;Attention&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Q, K, V&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt; &lt;/span&gt;&lt;span class="nv"&gt;\text&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;softmax&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="nv"&gt;\left&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;\frac&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;QK&lt;/span&gt;&lt;span class="p"&gt;^&lt;/span&gt;&lt;span class="nb"&gt;T&lt;/span&gt;&lt;span class="p"&gt;}{&lt;/span&gt;&lt;span class="nv"&gt;\sqrt&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;d&lt;/span&gt;&lt;span class="p"&gt;_&lt;/span&gt;&lt;span class="nb"&gt;k&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="nv"&gt;\right&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt; V&lt;/span&gt;&lt;span class="p"&gt;$$&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean LaTeX. It drops straight into a RAG pipeline, renders in any markdown viewer with math support, and feeds Claude or GPT without preprocessing.&lt;/p&gt;
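&lt;p&gt;One concrete payoff for RAG: because display math comes back as delimited &lt;code&gt;$$...$$&lt;/code&gt; blocks, a chunker can treat each equation as an atomic unit instead of splitting mid-formula. A minimal sketch (the splitting rules are illustrative, not from any particular framework):&lt;/p&gt;

```python
import re

def chunk_markdown(md, max_chars=800):
    """Split Marker's markdown into chunks, never breaking inside a
    $$...$$ display-math block."""
    # Display-math blocks stay whole; prose splits on blank lines.
    pieces = []
    for part in re.split(r"(\$\$.*?\$\$)", md, flags=re.DOTALL):
        if part.startswith("$$"):
            pieces.append(part)
        else:
            pieces.extend(p for p in part.split("\n\n") if p.strip())

    chunks, current = [], ""
    for piece in pieces:
        if current and len(current) + len(piece) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += piece + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

&lt;p&gt;A generic character splitter would happily cut that softmax equation in half; this keeps it retrievable as one unit.&lt;/p&gt;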

&lt;h2&gt;
  
  
  The Modal architecture
&lt;/h2&gt;

&lt;p&gt;Three things made the economics work:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Persistent volume for model weights.&lt;/strong&gt; The first container ever started downloads the 5GB of Marker weights to a Modal Volume. Every subsequent container mounts the volume and reuses them. Cold start on a warm volume is ~10 seconds instead of ~5 minutes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;models_volume&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;modal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Volume&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;marker-models&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;create_if_missing&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@app.cls&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;volumes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/root/.cache/datalab&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;models_volume&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;gpu&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;T4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;scaledown_window&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Parser&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nd"&gt;@modal.enter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_models&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;marker.converters.pdf&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PdfConverter&lt;/span&gt;
        &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;marker.models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_model_dict&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;converter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PdfConverter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;artifact_dict&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;create_model_dict&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Spawn-and-poll for long parses.&lt;/strong&gt; A 50-page paper takes 90–180 seconds. You can't hold an HTTP connection open that long, especially not behind a CDN. Modal's &lt;code&gt;function.spawn()&lt;/code&gt; returns a &lt;code&gt;FunctionCall&lt;/code&gt; object you can look up by ID later:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@api.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/parse-paper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;UploadFile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;File&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Form&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="n"&gt;pdf_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;_fetch_pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;call&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Parser&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pdf_bytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;call_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;object_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;queued&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;@api.get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/parse-paper/{call_id}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;poll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;call_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;call&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;modal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FunctionCall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;call_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;done&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;processing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
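&lt;p&gt;&lt;code&gt;_fetch_pdf&lt;/code&gt; just normalizes the two input modes (upload or URL) into validated bytes. A simplified synchronous sketch of that logic; the magic-byte check and the 50 MB cap are guardrails I assume here, not anything Marker requires:&lt;/p&gt;

```python
import urllib.request

MAX_PDF_BYTES = 50 * 1024 * 1024  # reject anything over 50 MB

def looks_like_pdf(data: bytes) -> bool:
    """Every real PDF starts with the magic bytes %PDF-."""
    return data[:5] == b"%PDF-"

def fetch_pdf(file_bytes=None, url=None) -> bytes:
    """Accept either raw uploaded bytes or a URL; return validated PDF bytes."""
    if file_bytes is None and url is None:
        raise ValueError("provide a file or a url")
    data = file_bytes
    if data is None:
        with urllib.request.urlopen(url) as resp:
            data = resp.read(MAX_PDF_BYTES + 1)
    if len(data) > MAX_PDF_BYTES:
        raise ValueError("PDF larger than 50 MB")
    if not looks_like_pdf(data):
        raise ValueError("payload does not look like a PDF")
    return data
```

&lt;p&gt;Validating before &lt;code&gt;spawn()&lt;/code&gt; matters: rejecting an HTML error page at the API layer is free, while letting it through wakes a GPU container just to fail.&lt;/p&gt;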



&lt;p&gt;&lt;strong&gt;3. Scale-to-zero.&lt;/strong&gt; &lt;code&gt;scaledown_window=300&lt;/code&gt; keeps a warm container for 5 minutes after the last request. After that, the container shuts down and idle cost drops to zero. I pay only for the seconds I'm actually parsing something.&lt;/p&gt;
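&lt;p&gt;Back-of-envelope, with placeholder numbers ($0.60/hr for a T4 is an assumption for illustration, not Modal's published pricing):&lt;/p&gt;

```python
# Back-of-envelope cost per parse under scale-to-zero.
# All rates below are placeholder assumptions, not published pricing.
GPU_RATE_PER_SEC = 0.60 / 3600   # assume ~$0.60/hr for a T4
PARSE_SECONDS = 120              # typical 50-page paper: 90-180s
SCALEDOWN_SECONDS = 300          # warm container lingers 5 min after last job

cost_per_parse = PARSE_SECONDS * GPU_RATE_PER_SEC
idle_tail = SCALEDOWN_SECONDS * GPU_RATE_PER_SEC  # paid once per burst, not per parse

print(f"compute per parse: ${cost_per_parse:.3f}")
print(f"idle tail per burst: ${idle_tail:.3f}")
```

&lt;p&gt;The point isn't the exact figures; it's that the idle tail is a fixed per-burst cost instead of a 24/7 GPU bill.&lt;/p&gt;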

&lt;h2&gt;
  
  
  The business side
&lt;/h2&gt;

&lt;p&gt;I put it behind a RapidAPI listing so distribution is one click for anyone comparing parsing APIs. Free tier is 2 papers/month (no credit card) and paid plans start at $9/mo for 75 papers.&lt;/p&gt;

&lt;p&gt;I'm not trying to beat Marker on quality (it IS Marker). I'm not trying to beat Mistral OCR on price (I can't). I'm solving one specific problem: &lt;strong&gt;"I want Marker quality without running a GPU server."&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Honest about what this is not
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Not my model.&lt;/strong&gt; It's Marker (Apache 2.0), hosted. I'm explicit about this on the landing page.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not the cheapest per-page option.&lt;/strong&gt; Mistral OCR is cheaper if you don't care about equation fidelity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not for scanned PDFs.&lt;/strong&gt; Typeset only — Marker doesn't do OCR.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not for arXiv-only workflows.&lt;/strong&gt; There's a free tool called arxiv2md that parses arXiv's HTML source if that's all you need.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Where this fits
&lt;/h2&gt;

&lt;p&gt;If you're doing RAG over bioRxiv, chemRxiv, published journal PDFs, internal research docs, or any scientific PDF that isn't on arXiv, and equation fidelity matters for your answers, this saves you a weekend.&lt;/p&gt;

&lt;p&gt;Landing: &lt;a href="https://paper-parser-landing.vercel.app" rel="noopener noreferrer"&gt;https://paper-parser-landing.vercel.app&lt;/a&gt;&lt;br&gt;
API: &lt;a href="https://rapidapi.com/kjyounai/api/scientific-paper-parser1" rel="noopener noreferrer"&gt;https://rapidapi.com/kjyounai/api/scientific-paper-parser1&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Feedback welcome — especially if you've tried self-hosting Marker before or have opinions on the async polling pattern. Happy to answer questions about the Modal setup or the Marker tradeoffs in the comments.&lt;/p&gt;

</description>
      <category>api</category>
      <category>llm</category>
      <category>rag</category>
      <category>tooling</category>
    </item>
  </channel>
</rss>
