<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Bisina Daniel</title>
    <description>The latest articles on DEV Community by Bisina Daniel (@dbisina).</description>
    <link>https://dev.to/dbisina</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3593521%2F806990be-98f3-4f9d-8c49-78bcab90bbc5.jpg</url>
      <title>DEV Community: Bisina Daniel</title>
      <link>https://dev.to/dbisina</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dbisina"/>
    <language>en</language>
    <item>
      <title>Introducing U-HOP — Universal Hardware Optimization Protocol</title>
      <dc:creator>Bisina Daniel</dc:creator>
      <pubDate>Mon, 03 Nov 2025 01:36:37 +0000</pubDate>
      <link>https://dev.to/dbisina/introducing-u-hop-universal-hardware-optimization-protocol-2m6i</link>
      <guid>https://dev.to/dbisina/introducing-u-hop-universal-hardware-optimization-protocol-2m6i</guid>
      <description>&lt;p&gt;Modern AI workloads shouldn’t need to be rewritten for every device. Yet today, performance still depends heavily on vendor-specific frameworks, driver stacks, and hand-tuned kernels.&lt;/p&gt;

&lt;p&gt;U-HOP (Universal Hardware Optimization Protocol) is an open initiative to break that dependency by creating a unified optimization layer that lets compute run fast anywhere.&lt;/p&gt;

&lt;p&gt;Write once → run optimized across GPUs, CPUs, NPUs, TPUs, and edge accelerators.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What U-HOP Does&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;U-HOP dynamically selects the best compute backend and generates optimized kernels for the underlying hardware — automatically.&lt;/p&gt;

&lt;p&gt;Think of it as:&lt;/p&gt;

&lt;p&gt;A protocol that maps high-level ops to the best low-level execution path available at runtime.&lt;/p&gt;
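&lt;p&gt;A minimal sketch of what such runtime dispatch could look like (all names below are illustrative, not the actual U-HOP API): each backend registers kernels per op, and the dispatcher picks the highest-priority backend that implements the requested op at call time.&lt;/p&gt;

```python
# Illustrative sketch of op-to-backend dispatch; names are hypothetical,
# not the real U-HOP API.

class Backend:
    def __init__(self, name, priority):
        self.name = name
        self.priority = priority   # higher = preferred when available
        self.kernels = {}          # op name -> callable

    def register(self, op, fn):
        self.kernels[op] = fn

class Dispatcher:
    def __init__(self):
        self.backends = []

    def add_backend(self, backend):
        self.backends.append(backend)

    def run(self, op, *args):
        # Pick the highest-priority backend that implements this op.
        candidates = [b for b in self.backends if op in b.kernels]
        best = max(candidates, key=lambda b: b.priority)
        return best.kernels[op](*args)

# A portable CPU fallback always exists; a GPU backend registered with a
# higher priority would transparently override it for the same op.
cpu = Backend("cpu", priority=0)
cpu.register("relu", lambda xs: [x if x > 0 else 0.0 for x in xs])

dispatch = Dispatcher()
dispatch.add_backend(cpu)
```

&lt;p&gt;The point of the protocol is that callers only ever say &lt;code&gt;dispatch.run("relu", data)&lt;/code&gt;; which kernel actually executes is a runtime decision.&lt;/p&gt;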

&lt;p&gt;Initial focus areas:&lt;br&gt;
    • Matrix operations (matmul)&lt;br&gt;
    • Conv2D ops&lt;br&gt;
    • ReLU / activation pipelines&lt;br&gt;
    • Device introspection + runtime backend selection&lt;br&gt;
    • Foundations for future AI-generated kernel synthesis&lt;/p&gt;
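&lt;p&gt;Device introspection can start as simple runtime probing. A hypothetical sketch (the &lt;code&gt;importlib&lt;/code&gt; and &lt;code&gt;os&lt;/code&gt; calls are real stdlib APIs; the backend names and checks are placeholders, not U-HOP's actual probing logic):&lt;/p&gt;

```python
# Hypothetical runtime backend probing: decide which backends the
# dispatcher is allowed to register on this machine.
import importlib.util
import os

def probe_backends():
    available = ["cpu"]                    # portable fallback, always present
    if importlib.util.find_spec("torch"):  # CUDA/ROCm via PyTorch, if installed
        available.append("torch")
    if os.path.exists("/dev/nvidia0"):     # crude NVIDIA device check on Linux
        available.append("cuda")
    return available
```

&lt;p&gt;Real introspection would go further (querying compute capability, memory, and driver versions), but even this level is enough to drive backend selection.&lt;/p&gt;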

&lt;p&gt;&lt;strong&gt;Why This Matters&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We’re moving toward a world where models run:&lt;br&gt;
    • On multi-GPU rigs&lt;br&gt;
    • On phones with NPUs&lt;br&gt;
    • On browser WebGPU&lt;br&gt;
    • On edge compute like Jetson / RK3588&lt;br&gt;
    • On future AI accelerators&lt;/p&gt;

&lt;p&gt;Fragmentation limits innovation.&lt;/p&gt;

&lt;p&gt;U-HOP’s goal is to unify compute execution and unlock “write once, run fast anywhere” for ML workloads — starting with real operator-level performance wins.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Current Status (MVP Phase)&lt;/strong&gt;&lt;br&gt;
    • Runtime architecture defined&lt;br&gt;
    • Backend probing + dispatch in progress&lt;br&gt;
    • Core op specification (v0.1) drafted&lt;br&gt;
    • First demos in pipeline: matmul across heterogeneous devices, ReLU + Conv2D proof runs, and benchmarking against naive execution paths&lt;/p&gt;
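&lt;p&gt;To make "benchmarking vs naive exec paths" concrete, here is an illustrative micro-benchmark (not U-HOP code): a naive triple-loop matmul next to one classic optimized path that transposes the right operand for cache-friendlier row access.&lt;/p&gt;

```python
# Illustrative benchmark of a naive matmul vs a transposed-operand variant.
import time

def naive_matmul(a, b):
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for t in range(k):
                s += a[i][t] * b[t][j]
            out[i][j] = s
    return out

def transposed_matmul(a, b):
    # Transpose b once so the inner product walks both operands row-wise.
    bt = list(map(list, zip(*b)))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def bench(fn, a, b, reps=3):
    # Report the best of several runs to reduce timer noise.
    best = float("inf")
    for _ in range(reps):
        t0 = time.perf_counter()
        fn(a, b)
        best = min(best, time.perf_counter() - t0)
    return best
```

&lt;p&gt;The same harness generalizes to comparing a dispatched backend kernel against the pure-Python baseline.&lt;/p&gt;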

&lt;p&gt;Next milestone: AI-generated kernel optimization demo.&lt;/p&gt;

&lt;p&gt;Repo:&lt;br&gt;
&lt;a href="//github.com/sevenloops/uhop"&gt;github.com/sevenloops/uhop&lt;/a&gt; (active early-stage build)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Get Involved&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We’re building in the open. If you’re passionate about:&lt;br&gt;
    • GPU architecture&lt;br&gt;
    • Kernel optimization&lt;br&gt;
    • Runtime compilers&lt;br&gt;
    • ONNX / CUDA / ROCm / WebGPU&lt;br&gt;
    • Edge acceleration&lt;br&gt;
    • AI-generated system code&lt;/p&gt;

&lt;p&gt;We’d love to collaborate.&lt;/p&gt;

&lt;p&gt;Comment. PR. Fork. Stress-test. Let’s build a new standard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vision&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A protocol layer that eventually becomes the bridge between AI and hardware, enabling models — and future AI compilers — to target any compute substrate without rewriting code.&lt;/p&gt;

&lt;p&gt;Hardware becomes a capability layer, not a constraint.&lt;/p&gt;

&lt;p&gt;U-HOP is a first step toward that future.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Call to action&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Clone the repo &amp;amp; try the early dispatch tests:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;git clone https://github.com/sevenloops/uhop&lt;br&gt;
cd uhop&lt;br&gt;
python tests/dispatch_demo.py&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
Share feedback, ideas, challenges, and benchmarks.&lt;br&gt;
Let’s shape the protocol together.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>nvidia</category>
      <category>amd</category>
      <category>gpu</category>
    </item>
  </channel>
</rss>
