<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: saripalli shanmukha kiran sagar</title>
    <description>The latest articles on DEV Community by saripalli shanmukha kiran sagar (@theoxfaber).</description>
    <link>https://dev.to/theoxfaber</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3923809%2Fca44db75-000f-48ea-b0f3-abe3c333f358.jpeg</url>
      <title>DEV Community: saripalli shanmukha kiran sagar</title>
      <link>https://dev.to/theoxfaber</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/theoxfaber"/>
    <language>en</language>
    <item>
      <title>I built a Rust LLM inference engine with custom WGSL GPU kernels, here's what I learned!</title>
      <dc:creator>saripalli shanmukha kiran sagar</dc:creator>
      <pubDate>Sat, 30 May 2026 07:33:21 +0000</pubDate>
      <link>https://dev.to/theoxfaber/i-built-a-rust-llm-inference-engine-with-custom-wgsl-gpu-kernels-heres-what-i-learned-37gc</link>
      <guid>https://dev.to/theoxfaber/i-built-a-rust-llm-inference-engine-with-custom-wgsl-gpu-kernels-heres-what-i-learned-37gc</guid>
      <description>&lt;p&gt;I've been working on a side project called aether, a Rust LLM inference engine that loads GGUF models and runs them with WGPU GPU acceleration.&lt;/p&gt;

&lt;p&gt;It started as a way to understand how LLMs actually work under the hood. One thing led to another, and now it has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;GGUF model loading (Llama/Mistral/Phi/Qwen)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;WGPU GPU backend (Metal/Vulkan/DX12)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Custom fused WGSL compute shaders for Q8_0 and Q4_K quantized matmul (weights are dequantized inline rather than in a separate pass)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Concurrent request pool for serving multiple users&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;OpenAI-compatible API server (axum)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Pure Rust, no Python dependencies in the hot path&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The GPU path is still experimental (CPU mode is the safe default), but the dequant shaders and the fused matmul kernels were honestly the most fun part to write.&lt;/p&gt;

&lt;p&gt;I'm not trying to compete with llama.cpp or MLX; this was primarily a learning project that grew into something actually useful. Happy to answer questions or take feedback.&lt;/p&gt;

&lt;p&gt;Stack: Rust, WGPU, WGSL, GGUF, axum, Tokio&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/theoxfaber/aether" rel="noopener noreferrer"&gt;https://github.com/theoxfaber/aether&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(Full transparency: the majority of this code and post was written with AI assistance. I drove the design decisions, architecture, and testing; AI handled a lot of the implementation. Treat it accordingly.)&lt;/p&gt;

</description>
      <category>rust</category>
      <category>llm</category>
      <category>ai</category>
      <category>programming</category>
    </item>
    <item>
      <title>I built a pure-Rust browser automation library, no Node.js, no wrappers, just CDP over Tokio</title>
      <dc:creator>saripalli shanmukha kiran sagar</dc:creator>
      <pubDate>Sun, 10 May 2026 21:08:14 +0000</pubDate>
      <link>https://dev.to/theoxfaber/i-built-a-pure-rust-browser-automation-library-no-nodejs-no-wrappers-just-cdp-over-tokio-31hb</link>
      <guid>https://dev.to/theoxfaber/i-built-a-pure-rust-browser-automation-library-no-nodejs-no-wrappers-just-cdp-over-tokio-31hb</guid>
      <description>&lt;p&gt;I got tired of every Rust browser automation library being either a thin wrapper around Node.js (slow, heavy) or unmaintained and archived. So I built ferrous-browser, a pure-Rust, async-first Chrome DevTools Protocol client that ships as a single binary.&lt;br&gt;
The core ideas behind it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Zero Node.js. It talks directly to Chrome over CDP using Tokio WebSockets. No npm, no subprocess bridges, nothing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Correct multi-page isolation. CDP session IDs are tracked per page, so concurrent pages don't leak events into each other, something a lot of existing libraries quietly get wrong.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Race-condition-free event handling. Event handlers are registered before the commands that trigger them, not after. Sounds obvious, but most implementations don't do this.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A Playwright-inspired API. locator(), evaluate(), and WaitUntil are familiar if you've used Playwright or Puppeteer, but idiomatic Rust.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's the basic setup in Cargo.toml:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[dependencies]&lt;/span&gt;
&lt;span class="py"&gt;ferrous-browser&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0.1"&lt;/span&gt;
&lt;span class="py"&gt;tokio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;features&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"full"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It requires Chrome or Chromium installed locally. That's it.&lt;br&gt;
A minimal example that navigates to a page, reads a heading, and takes a screenshot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;ferrous_browser&lt;/span&gt;&lt;span class="p"&gt;::{&lt;/span&gt;&lt;span class="n"&gt;Browser&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;WaitUntil&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="nd"&gt;#[tokio::main]&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nb"&gt;Box&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;dyn&lt;/span&gt; &lt;span class="nn"&gt;std&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;error&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Browser&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;launch_chrome&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt;&lt;span class="nf"&gt;.new_page&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="nf"&gt;.goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"https://example.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;WaitUntil&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Load&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;heading&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="nf"&gt;.locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"h1"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.inner_text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nd"&gt;println!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Heading: {heading}"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;png&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="nf"&gt;.screenshot&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nn"&gt;std&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"screenshot.png"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;png&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The locator API covers click, type_text, wait_for, inner_text, and get_attribute. evaluate() is generic: you pass a JS expression and it deserializes the result into whatever Rust type you specify.&lt;/p&gt;

&lt;p&gt;For navigation, there are three wait modes: DomContentLoaded for speed, Load for full resource loading, and NetworkIdle, which waits until there has been no network activity for 500 ms and is useful for SPAs.&lt;/p&gt;

&lt;p&gt;Every failure carries structured context instead of an opaque string:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="nf"&gt;.goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"https://bad-url"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;WaitUntil&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Load&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;BrowserError&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;NavigationFailed&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reason&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="nd"&gt;eprintln!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Navigation to {url} failed: {reason}"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;BrowserError&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Timeout&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;operation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;secs&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="nd"&gt;eprintln!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"{operation} timed out after {secs}s"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nd"&gt;eprintln!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"{e}"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also chain context onto any Result with .context("loading homepage")?.&lt;/p&gt;

&lt;p&gt;Of the existing Rust options, chromiumoxide is stale and headless_chrome is archived; ferrous-browser is the only one with an active locator API, NetworkIdle support, and structured errors out of the box.&lt;/p&gt;

&lt;p&gt;The benchmarks are honest: raw page creation is slower than Puppeteer right now (Chrome's session routing is heavily optimized on their end), but the gap closes fast when the workload is real scraping or E2E testing rather than micro-benchmarks.&lt;/p&gt;

&lt;p&gt;What's on the roadmap: cookie management, PDF export, evaluate_handle for remote object references, HAR/trace capture, and full Windows support.&lt;/p&gt;

&lt;p&gt;I'd love feedback, especially from anyone who's hit multi-page session isolation bugs in other libraries; that was the main itch I was scratching.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/theoxfaber/ferrous-browser" rel="noopener noreferrer"&gt;https://github.com/theoxfaber/ferrous-browser&lt;/a&gt;&lt;br&gt;
Crates.io: &lt;a href="https://crates.io/crates/ferrous-browser" rel="noopener noreferrer"&gt;https://crates.io/crates/ferrous-browser&lt;/a&gt;&lt;/p&gt;

</description>
      <category>rust</category>
      <category>webdev</category>
      <category>opensource</category>
      <category>automation</category>
    </item>
  </channel>
</rss>
