<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: zaochuan5854</title>
    <description>The latest articles on DEV Community by zaochuan5854 (@zaochuan5854).</description>
    <link>https://dev.to/zaochuan5854</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3874246%2F99b7db3d-c7b0-43e6-b086-bef6f343bb87.png</url>
      <title>DEV Community: zaochuan5854</title>
      <link>https://dev.to/zaochuan5854</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/zaochuan5854"/>
    <language>en</language>
    <item>
      <title>Stop Choosing Between Speed and LoRAs: Meet ComfyUI-TensorRT-Reforge 🚀</title>
      <dc:creator>zaochuan5854</dc:creator>
      <pubDate>Sun, 12 Apr 2026 01:09:30 +0000</pubDate>
      <link>https://dev.to/zaochuan5854/stop-choosing-between-speed-and-loras-meet-comfyui-tensorrt-reforge-59bc</link>
      <guid>https://dev.to/zaochuan5854/stop-choosing-between-speed-and-loras-meet-comfyui-tensorrt-reforge-59bc</guid>
      <description>&lt;h2&gt;
  
  
  👋 Introduction
&lt;/h2&gt;

&lt;p&gt;Hey ComfyUI creators! Have you ever found yourself generating images and thinking, &lt;strong&gt;"I really wish this was blazingly fast"&lt;/strong&gt;? &lt;/p&gt;

&lt;p&gt;If you've looked into accelerating AI model inference, you've probably heard of &lt;strong&gt;TensorRT&lt;/strong&gt;. While there are a few custom nodes out there that bring TensorRT to ComfyUI, they often come with frustrating trade-offs. You usually hear complaints like, &lt;em&gt;"I can't use my LoRAs anymore,"&lt;/em&gt; or &lt;em&gt;"The node is outdated and unmaintained..."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;To solve this, I've developed &lt;a href="https://github.com/zaochuan5854/ComfyUI-TensorRT-Reforge" rel="noopener noreferrer"&gt;ComfyUI-TensorRT-Reforge&lt;/a&gt;! 🚀 It's a brand-new custom node that lets you &lt;strong&gt;reap the benefits of TensorRT's insane speeds while still using your favorite LoRAs freely.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In this post, I'll walk you through how to set it up, how to use it, and dive into some of the cool tech working under the hood. Let's dive in! 👇&lt;/p&gt;

&lt;h4&gt;
  
  
  🙌 Acknowledgments
&lt;/h4&gt;

&lt;p&gt;This project builds upon the fantastic &lt;a href="https://github.com/comfyanonymous/ComfyUI_TensorRT" rel="noopener noreferrer"&gt;ComfyUI-TensorRT&lt;/a&gt; originally created by ComfyUI's author, &lt;a href="https://github.com/comfyanonymous" rel="noopener noreferrer"&gt;comfyanonymous&lt;/a&gt;. Huge thanks to them!&lt;/p&gt;




&lt;h2&gt;
  
  
  🛠️ What's in the Box?
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;ComfyUI-TensorRT-Reforge&lt;/code&gt; is kept simple and consists of two main custom nodes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;TensorRT Exporter Reforge&lt;/strong&gt; (The Exporter)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TensorRT Loader Reforge&lt;/strong&gt; (The Loader)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Their roles are straightforward. &lt;br&gt;
First, the &lt;strong&gt;Exporter&lt;/strong&gt; takes your standard &lt;code&gt;.safetensors&lt;/code&gt; model and converts it into a highly optimized TensorRT model. Next, the &lt;strong&gt;Loader&lt;/strong&gt; brings that converted model into ComfyUI, wrapping it so you can use it exactly like you would any normal model.&lt;/p&gt;


&lt;h2&gt;
  
  
  💻 System Requirements
&lt;/h2&gt;

&lt;p&gt;Here are the requirements and the environment I used for testing:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Target&lt;/th&gt;
&lt;th&gt;Requirements&lt;/th&gt;
&lt;th&gt;My Test Environment&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Windows 11/10, WSL, Ubuntu&lt;/td&gt;
&lt;td&gt;Docker on WSL (See Dockerfile below)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GPU&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;RTX 2000 series or newer&lt;/td&gt;
&lt;td&gt;RTX 4070 Ti&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;VRAM&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8GB minimum&lt;/td&gt;
&lt;td&gt;12GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CUDA&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;12.x&lt;/td&gt;
&lt;td&gt;12.8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Models&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SD1.5, SDXL, AuraFlow, Flux, SD3, Anima, SVD&lt;/td&gt;
&lt;td&gt;SD1.5, SDXL, SD3, Anima&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Note on CUDA versions&lt;/strong&gt;&lt;br&gt;
CUDA 11 and CUDA 13 are not officially supported right now. However, you might get them to work by tweaking the ONNX/TensorRT versions or export options. If you manage to get it running on those versions, please drop a comment in our &lt;a href="https://github.com/zaochuan5854/ComfyUI-TensorRT-Reforge/discussions" rel="noopener noreferrer"&gt;Discussions section&lt;/a&gt;—I'd love to hear about it!&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  📦 Installation
&lt;/h2&gt;

&lt;p&gt;Installing &lt;code&gt;ComfyUI-TensorRT-Reforge&lt;/code&gt; is just as easy as any other custom node. Choose the method that best fits your workflow.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Via ComfyUI-Manager (Recommended)
&lt;/h3&gt;

&lt;p&gt;If you use ComfyUI-Manager, you're just a few clicks away:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Click on &lt;strong&gt;[Manager]&lt;/strong&gt; in the ComfyUI menu.&lt;/li&gt;
&lt;li&gt;Open the &lt;strong&gt;[Custom Nodes Manager]&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Search for &lt;code&gt;TensorRT-Reforge&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Once you spot &lt;code&gt;ComfyUI-TensorRT-Reforge&lt;/code&gt;, hit &lt;strong&gt;[Install]&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Restart&lt;/strong&gt; ComfyUI.&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;💡 Can't find it in the Manager?&lt;/strong&gt;&lt;br&gt;
Since this project is brand new, it might take a moment to appear in the default list. If you don't see it, you can use the "Install via Git URL" feature in the Manager, or just fall back to the manual installation below.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  2. Manual Installation (git clone)
&lt;/h3&gt;

&lt;p&gt;If you prefer the terminal, navigate to your ComfyUI directory and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Navigate to the custom_nodes directory&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;custom_nodes

&lt;span class="c"&gt;# Clone the repository&lt;/span&gt;
git clone https://github.com/zaochuan5854/ComfyUI-TensorRT-Reforge

&lt;span class="c"&gt;# Install the required dependencies&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;ComfyUI-TensorRT-Reforge
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🚀 How to Use It
&lt;/h2&gt;

&lt;p&gt;Let's walk through the workflow step-by-step. &lt;br&gt;
&lt;em&gt;(Pro-tip: You can drag and drop the workflow image at the bottom of this article directly into ComfyUI to import it!)&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 1: Convert your model (The Exporter)
&lt;/h3&gt;

&lt;p&gt;Drop an Exporter node into your workspace and select the &lt;code&gt;.safetensors&lt;/code&gt; model you want to turbocharge. &lt;br&gt;
Next, configure your constraints: batch size, resolution range, and &lt;strong&gt;whether you want to enable LoRA&lt;/strong&gt;. Give it a prefix name, and hit queue!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frcavbkxcwy3jbpxsbdd9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frcavbkxcwy3jbpxsbdd9.png" alt="TensorRT-Reforge Exporter" width="708" height="1125"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;👆 In this example, I'm converting &lt;code&gt;anima-preview2.safetensor&lt;/code&gt; into a TensorRT model locked to a batch size of 1 and an exact 1024x1024 resolution.&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Crucial Exporting Tips:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The conversion process can take anywhere from &lt;strong&gt;3 to 10 minutes&lt;/strong&gt;. Grab a coffee and be patient! ☕&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;If you don't enable LoRA here, you CANNOT apply LoRAs to this model later!&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;If you're using an Anima model or have LoRA enabled, the exporter will generate a custom &lt;code&gt;.bundle&lt;/code&gt; file (see the appendix for nerds).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;💡 Why the "Resolution Range"?&lt;/strong&gt;&lt;br&gt;
TensorRT requires you to strictly define the shape (size) of the data it will process beforehand. Locking it to a single, exact size yields the fastest speeds, but restricts you to that specific output size. (Setting min/max to &lt;code&gt;0&lt;/code&gt; leaves it unrestricted).&lt;/p&gt;
&lt;/blockquote&gt;
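&lt;p&gt;&lt;em&gt;To make the min/opt/max idea concrete, here is a tiny illustrative sketch (not the node's actual code; the &lt;code&gt;ShapeProfile&lt;/code&gt; class is hypothetical) of how such a constraint behaves:&lt;/em&gt;&lt;/p&gt;

```python
# Illustrative sketch only: TensorRT engines are built against
# min/opt/max shape constraints, similar in spirit to this.
from dataclasses import dataclass

@dataclass
class ShapeProfile:
    """Hypothetical min/opt/max constraint for one dimension."""
    min: int
    opt: int
    max: int

    def allows(self, value: int) -> bool:
        # min == max locks the engine to one exact size (fastest);
        # a wider range trades some speed for flexibility.
        return self.min <= value <= self.max

# Locked profile: exactly 1024, as in the export example above.
height = ShapeProfile(min=1024, opt=1024, max=1024)
```

&lt;p&gt;&lt;em&gt;With this locked profile, &lt;code&gt;height.allows(1024)&lt;/code&gt; passes but &lt;code&gt;height.allows(768)&lt;/code&gt; fails, which is exactly why a mismatched latent size errors out later in the workflow.&lt;/em&gt;&lt;/p&gt;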
&lt;h3&gt;
  
  
  Step 2: Load the Engine (The Loader)
&lt;/h3&gt;

&lt;p&gt;Once the export is done, grab a Loader node. Select the shiny new model you just created and specify the Model Type.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbd3utujcjp5dzzoumii6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbd3utujcjp5dzzoumii6.png" alt="TensorRT-Reforge Loader" width="800" height="341"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Model Type is usually auto-detected from the filename, but it never hurts to double-check!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Step 3: Setup Latents and Generate!
&lt;/h3&gt;

&lt;p&gt;From here on out, just wire it up like your standard ComfyUI workflow and hit generate. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl7qsvl940dksuw5kj02c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl7qsvl940dksuw5kj02c.png" alt="TensorRT-Reforge AnimaWorkflow" width="800" height="347"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🚨 ALERT: Watch your Latent Sizes!&lt;/strong&gt;&lt;br&gt;
If your empty latent image size or batch size conflicts with the constraints you set in the Exporter, ComfyUI will throw an error. If you forget your settings, just check the TensorRT filename—the constraints are usually tagged right there.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw5krf1qb9eypb3rtzo84.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw5krf1qb9eypb3rtzo84.png" alt="TensorRT-Reforge Latent" width="800" height="418"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;👆 For example, here I'm using Width 1024, Height 1024, and Batch Size 1, perfectly matching my export settings.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  🤓 The Tech Behind It: How is it so fast while still supporting LoRAs?
&lt;/h2&gt;

&lt;p&gt;Time for some under-the-hood geekery! How exactly does this node manage to achieve dramatic speedups while still allowing dynamic LoRA swapping? &lt;/p&gt;
&lt;h3&gt;
  
  
  The Basics
&lt;/h3&gt;
&lt;h4&gt;
  
  
  TensorRT: The Custom-Built F1 Car
&lt;/h4&gt;

&lt;p&gt;Normally, when ComfyUI loads a &lt;code&gt;.safetensors&lt;/code&gt; file, the math is handled by PyTorch. PyTorch is incredibly flexible, like a reliable all-terrain vehicle with a manual transmission. But that flexibility comes with a bit of processing overhead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TensorRT&lt;/strong&gt;, on the other hand, performs hyper-optimizations specifically for &lt;em&gt;your exact GPU&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Kernel Fusion:&lt;/strong&gt; It takes separate mathematical operations (like ReLU and Conv) and smashes them together into single commands, drastically reducing memory access times.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimal Routing:&lt;/strong&gt; It automatically benchmarks and selects the absolute fastest algorithms your specific silicon can run.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of TensorRT as ditching the all-terrain vehicle to build a &lt;strong&gt;custom F1 car tuned to drive perfectly on one specific circuit.&lt;/strong&gt;&lt;/p&gt;
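&lt;p&gt;&lt;em&gt;The fusion idea can be illustrated in plain NumPy (purely conceptual; the real savings happen at the CUDA-kernel level, which NumPy cannot show):&lt;/em&gt;&lt;/p&gt;

```python
# Conceptual illustration only: "fusing" a bias-add and ReLU means
# the intermediate tensor never has to round-trip through GPU memory.
import numpy as np

x = np.array([-2.0, -0.5, 1.0, 3.0])
bias = 0.5

# Unfused: two passes, one intermediate array materialized.
tmp = x + bias
unfused = np.maximum(tmp, 0.0)

# "Fused": one combined expression. TensorRT performs this merge
# at the compiled-kernel level, which is where the speedup comes from.
fused = np.maximum(x + bias, 0.0)
```

&lt;p&gt;&lt;em&gt;Both paths produce identical results; the win is purely in memory traffic, not math.&lt;/em&gt;&lt;/p&gt;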
&lt;h4&gt;
  
  
  LoRA (Low-Rank Adaptation): Smart Delta Learning
&lt;/h4&gt;

&lt;p&gt;Instead of painfully modifying the massive, billions-of-parameters base model, LoRA alters the output by simply injecting tiny matrices. &lt;/p&gt;

&lt;p&gt;Mathematically, the update to the original weight matrix &lt;strong&gt;W&lt;/strong&gt; looks like this:&lt;/p&gt;

&lt;p&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;Wupdated=W+ΔW=W+BAW_{updated} = W + \Delta W = W + BA &lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;W&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;u&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;p&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;d&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;a&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;t&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;e&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;d&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;W&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;Δ&lt;/span&gt;&lt;span class="mord mathnormal"&gt;W&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord 
mathnormal"&gt;W&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;B&lt;/span&gt;&lt;span class="mord mathnormal"&gt;A&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;



&lt;p&gt;Because matrices &lt;strong&gt;A&lt;/strong&gt; and &lt;strong&gt;B&lt;/strong&gt; are ridiculously small compared to the original, you get incredible stylistic control with barely any computational or storage overhead.&lt;/p&gt;
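&lt;p&gt;&lt;em&gt;A minimal numeric sketch of that update (dimensions are made up for illustration):&lt;/em&gt;&lt;/p&gt;

```python
# Toy-scale LoRA update: W' = W + B @ A.
import numpy as np

d, r = 1024, 8                      # model dim vs. tiny LoRA rank
rng = np.random.default_rng(0)
W = rng.standard_normal((d, d))     # frozen base weight
A = rng.standard_normal((r, d))     # LoRA "down" projection
B = np.zeros((d, r))                # LoRA "up" projection (zero-init)

W_updated = W + B @ A               # with B = 0 this is exactly W

# The delta is stored with far fewer parameters than W itself:
full_params = d * d                 # 1,048,576
lora_params = d * r + r * d         # 16,384, about 1.6% of the above
```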

&lt;h3&gt;
  
  
  Enter "Refit": Making the Rigid Flexible
&lt;/h3&gt;

&lt;p&gt;Historically, TensorRT's biggest flaw for AI artists was its &lt;strong&gt;rigidity&lt;/strong&gt;. &lt;br&gt;
Because it physically "compiles" that highly optimized F1 engine, if you wanted to change the weights (like adding a new LoRA), you essentially had to spend several minutes rebuilding the entire F1 car from scratch. Not ideal for rapid iteration.&lt;/p&gt;

&lt;p&gt;The solution? A feature called &lt;strong&gt;Refit&lt;/strong&gt;.&lt;/p&gt;
&lt;h4&gt;
  
  
  What does Refit do?
&lt;/h4&gt;

&lt;p&gt;Refit allows us to rapidly overwrite the internal "weights" of the engine &lt;em&gt;without&lt;/em&gt; altering the underlying mathematical structure. &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The skeleton stays intact:&lt;/strong&gt; TensorRT keeps all of its ultra-fast, fused kernel paths.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Injecting the payload:&lt;/strong&gt; We mark specific weight zones in advance ("Hey, we might change this later!"). When you swap a LoRA, we just pour the new LoRA math directly into those pre-marked slots.&lt;/li&gt;
&lt;/ol&gt;
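&lt;p&gt;&lt;em&gt;Here is a toy simulation of that contract in plain Python. The real node relies on TensorRT's Refitter API; this class is purely illustrative:&lt;/em&gt;&lt;/p&gt;

```python
# Toy model of the Refit idea: the "compiled" compute path is fixed,
# and only slots marked refittable at build time may be overwritten.
import numpy as np

class ToyEngine:
    def __init__(self, weight, refittable=False):
        self._weight = weight            # the "compiled-in" weights
        self._refittable = refittable    # marked when the engine is built

    def run(self, x):
        return x @ self._weight          # fixed, optimized structure

    def refit(self, new_weight):
        if not self._refittable:
            raise RuntimeError("engine was built without refit support")
        if new_weight.shape != self._weight.shape:
            raise ValueError("refit cannot change the structure")
        self._weight = new_weight        # seconds, instead of a rebuild

W = np.eye(2)
engine = ToyEngine(W, refittable=True)
x = np.array([[1.0, 2.0]])
y_before = engine.run(x)
engine.refit(W + np.full((2, 2), 0.1))   # e.g. pour in W + BA from a LoRA
y_after = engine.run(x)
```

&lt;p&gt;&lt;em&gt;Note the key restriction carried over from the real thing: you can swap the numbers inside the slots, but never the shape of the slots themselves.&lt;/em&gt;&lt;/p&gt;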
&lt;h4&gt;
  
  
  Why is this a game changer?
&lt;/h4&gt;

&lt;p&gt;By leveraging Refit, the previously agonizing process of swapping LoRAs in TensorRT is solved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lightning-fast swaps:&lt;/strong&gt; What used to take 5-10 minutes now takes a few seconds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero speed loss:&lt;/strong&gt; You maintain 100% of the insane inference speeds of TensorRT while freely hot-swapping LoRAs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;"TensorRT Speeds" × "LoRA Flexibility"&lt;/strong&gt;. We finally get the best of both worlds, and that's exactly what &lt;code&gt;ComfyUI-TensorRT-Reforge&lt;/code&gt; delivers!&lt;/p&gt;


&lt;h2&gt;
  
  
  📝 Appendix
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Workflow Example
&lt;/h3&gt;

&lt;p&gt;Drag and drop this image into your ComfyUI canvas to instantly import the workflow used in this article:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1c1xg9lfwspa3suq5mm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1c1xg9lfwspa3suq5mm.png" alt="image.png" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Dockerfile for Development
&lt;/h3&gt;

&lt;p&gt;If you prefer running ComfyUI in Docker, here is my setup. Feel free to tweak it to your needs (warning: building this takes a while!).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nv"&gt;DEBIAN_FRONTEND&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;noninteractive apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nt"&gt;--no-install-recommends&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    tzdata &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-sf&lt;/span&gt; /usr/share/zoneinfo/Asia/Tokyo /etc/localtime &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Asia/Tokyo"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /etc/timezone &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    apt-get clean &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/apt/lists/&lt;span class="k"&gt;*&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nt"&gt;--no-install-recommends&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    build-essential curl git tmux nano htop lsyncd ssh-client fontconfig fonts-ipafont fonts-ipaexfont&lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/apt/lists/&lt;span class="k"&gt;*&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;fc-cache &lt;span class="nt"&gt;-fv&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; VIRTUAL_ENV=/opt/venv&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; PATH="$VIRTUAL_ENV/bin:$PATH"&lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /opt&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; UV_HTTP_TIMEOUT=600&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;uv venv &lt;span class="nv"&gt;$VIRTUAL_ENV&lt;/span&gt; &lt;span class="nt"&gt;--python&lt;/span&gt; 3.12 &lt;span class="nt"&gt;--seed&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; uv pip &lt;span class="nb"&gt;install &lt;/span&gt;torch torchvision torchaudio &lt;span class="nt"&gt;--index-url&lt;/span&gt; https://download.pytorch.org/whl/cu128 &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; uv pip &lt;span class="nb"&gt;install &lt;/span&gt;comfy-cli ComfyUI-EasyNodes beautifulsoup4 aiohttp_retry

&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;n&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;echo &lt;/span&gt;y&lt;span class="o"&gt;)&lt;/span&gt; | comfy &lt;span class="nt"&gt;--workspace&lt;/span&gt; /opt/comfyui &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--nvidia&lt;/span&gt; &lt;span class="nt"&gt;--cuda-version&lt;/span&gt; 12.8&lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; comfy &lt;span class="nt"&gt;--workspace&lt;/span&gt; /opt/comfyui node &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    ComfyUI-Impact-Pack &lt;span class="se"&gt;\
&lt;/span&gt;    ComfyUI-Manager &lt;span class="se"&gt;\
&lt;/span&gt;    was-node-suite-comfyui

&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /opt/comfyui/custom_nodes &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git clone https://github.com/Suzie1/ComfyUI_Comfyroll_CustomNodes.git

&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /opt/comfyui/custom_nodes &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git clone https://github.com/rgthree/rgthree-comfy.git

&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /opt/comfyui/custom_nodes &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git clone https://github.com/cosmicbuffalo/comfyui-mobile-frontend.git

&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; UV_INDEX_STRATEGY=unsafe-best-match&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /opt/comfyui/custom_nodes &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; git clone https://github.com/zaochuan5854/ComfyUI-TensorRT-Reforge.git &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;ComfyUI-TensorRT-Reforge &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; uv pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /opt/comfyui&lt;/span&gt;

&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; COMFYUI_PATH="/opt/comfyui"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bundle File Format
&lt;/h3&gt;

&lt;p&gt;For advanced users who want to extract the underlying &lt;code&gt;.engine&lt;/code&gt; or &lt;code&gt;.onnx&lt;/code&gt; files, here is the &lt;code&gt;.bundle&lt;/code&gt; layout. You can also reference the &lt;a href="https://github.com/zaochuan5854/ComfyUI-TensorRT-Reforge/blob/main/trt_utils.py" rel="noopener noreferrer"&gt;source code&lt;/a&gt; for parser logic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## File Layout
Each data chunk's role is defined by its leading ID. The appearance order within the file is arbitrary.

+-----------------------------------------+ &amp;lt;--- Offset 0
| [ID:1B][Size:8B][Chunk Data...]         | Data Chunk A
+-----------------------------------------+
| [ID:1B][Size:8B][Chunk Data...]         | Data Chunk B
+-----------------------------------------+
| ...                                     | (Additional Chunks in any order)
+-----------------------------------------+ &amp;lt;--- End of Data Chunks (data_limit)
|                                         |
|      Metadata Section (JSON)            | Variable Length (No ID prefix)
|                                         |
+-----------------------------------------+ &amp;lt;--- Metadata End (EOF - 8 bytes)
|      Metadata Size (8 bytes)            | uint64, Little Endian
+-----------------------------------------+ &amp;lt;--- EOF

---

## ID Definition
- 0x01: TensorRT Engine Data
- 0x02: ONNX Model Data
- 0x03: WeightsMap (JSON / Binary)
- 0x04-0xFF: Reserved for future extensions

---

## Parsing Logic (ID-Driven)
1. Read the last 8 bytes of the file to get `meta_size`.
2. Calculate `data_limit` = (EOF - 8 - meta_size).
3. Initialize `current_offset = 0`.
4. While `current_offset &amp;lt; data_limit`:
    a. Read 1 byte as `chunk_id`.
    b. Read 8 bytes as `chunk_size`.
    c. Record `data_start = current_offset + 9`.
    d. Store the mapping of `chunk_id` -&amp;gt; `data_start`.
    e. Jump to the next chunk: `current_offset += (9 + chunk_size)`.
5. Seek to `data_limit` and parse the Metadata JSON.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
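&lt;p&gt;&lt;em&gt;Based on the layout above, a parser might look like the sketch below. It assumes chunk sizes share the little-endian uint64 encoding of the trailing metadata size; see &lt;code&gt;trt_utils.py&lt;/code&gt; for the authoritative implementation:&lt;/em&gt;&lt;/p&gt;

```python
# Sketch of a .bundle parser following the spec above.
import io, json, struct

CHUNK_ENGINE, CHUNK_ONNX, CHUNK_WEIGHTS_MAP = 0x01, 0x02, 0x03

def parse_bundle(data: bytes):
    # Steps 1-2: the trailing 8 bytes give meta_size; chunks end at data_limit.
    meta_size = struct.unpack("<Q", data[-8:])[0]
    data_limit = len(data) - 8 - meta_size
    metadata = json.loads(data[data_limit:len(data) - 8])

    # Steps 3-4: walk [ID:1B][Size:8B][payload] records up to data_limit.
    chunks, offset = {}, 0
    while offset < data_limit:
        chunk_id = data[offset]
        chunk_size = struct.unpack("<Q", data[offset + 1:offset + 9])[0]
        chunks[chunk_id] = data[offset + 9:offset + 9 + chunk_size]
        offset += 9 + chunk_size
    return chunks, metadata

def make_bundle(chunks: dict, metadata: dict) -> bytes:
    # Inverse operation, used here to build a synthetic bundle for testing.
    buf = io.BytesIO()
    for cid, payload in chunks.items():
        buf.write(bytes([cid]) + struct.pack("<Q", len(payload)) + payload)
    meta = json.dumps(metadata).encode()
    buf.write(meta + struct.pack("<Q", len(meta)))
    return buf.getvalue()

blob = make_bundle({CHUNK_ENGINE: b"engine", CHUNK_ONNX: b"onnx"}, {"v": 1})
chunks, meta = parse_bundle(blob)
```

&lt;p&gt;&lt;em&gt;Because each record is self-describing, the parser also handles chunks in any order, exactly as the spec allows.&lt;/em&gt;&lt;/p&gt;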



</description>
      <category>ai</category>
      <category>performance</category>
      <category>programming</category>
      <category>python</category>
    </item>
  </channel>
</rss>
