<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Yoshio Nomura</title>
    <description>The latest articles on DEV Community by Yoshio Nomura (@asterios07).</description>
    <link>https://dev.to/asterios07</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3735191%2F79d4c44c-24d1-4299-ae1c-8a1d1e5caac0.jpeg</url>
      <title>DEV Community: Yoshio Nomura</title>
      <link>https://dev.to/asterios07</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/asterios07"/>
    <language>en</language>
    <item>
      <title>🧑‍💻 GitHub Actions: Automating Edge LLMOps Deployments via CI/CD 🧑‍💻</title>
      <dc:creator>Yoshio Nomura</dc:creator>
      <pubDate>Fri, 27 Mar 2026 13:46:14 +0000</pubDate>
      <link>https://dev.to/asterios07/github-actions-automating-edge-llmops-deployments-via-cicd-41jl</link>
      <guid>https://dev.to/asterios07/github-actions-automating-edge-llmops-deployments-via-cicd-41jl</guid>
      <description>&lt;p&gt;❌ A brutal reality of edge computing: an architecture bound to a single local machine is not a deployment; it is a liability.&lt;/p&gt;

&lt;p&gt;👉 In Phase 7, I engineered a Merchant of Record (MoR) Kubernetes cluster capable of trans-continental B2B routing. However, the deployment execution remained localized. Today, in Phase 8, I severed that physical tether by injecting a strict CI/CD pipeline.&lt;/p&gt;

&lt;p&gt;🟢 1. The Hyperscaler Handoff: Global capital demands zero points of failure. The compilation of the inference container and the structural auditing of the K3s declarative state have been completely offloaded to GitHub Actions.&lt;/p&gt;

&lt;p&gt;🟢 2. The Immutable Verification: Upon every merge to the "enterprise-saas-mor" branch, the pipeline executes a multi-stage Docker build, uses GitHub Actions (gha) layer caching to minimize compute time, and performs a strict --dry-run audit across the entire set of Kubernetes observability and stateful-ledger manifests. If a single YAML indentation is flawed, the pipeline blocks the deployment.&lt;/p&gt;
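
&lt;p&gt;A hedged sketch of what such a workflow can look like (job names, file paths, and the k8s/ manifest directory are my assumptions, not the repository's exact file):&lt;/p&gt;

```yaml
name: edge-llmops-ci

on:
  push:
    branches: ["enterprise-saas-mor"]

jobs:
  build-and-audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - name: Multi-stage build with gha layer caching
        uses: docker/build-push-action@v5
        with:
          context: .
          push: false
          cache-from: type=gha
          cache-to: type=gha,mode=max
      - name: Dry-run audit of the Kubernetes manifests
        run: kubectl apply --dry-run=client -f k8s/
```

&lt;p&gt;If any manifest under k8s/ is malformed, the dry-run step exits non-zero and the whole job fails, which is exactly the blocking behavior described above.&lt;/p&gt;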

&lt;p&gt;🟢 3. The Decoupled Reality: The local hardware is now strictly an environment for experimentation. The production matrix is verified, built, and staged globally, rendering the physical origin of the code irrelevant to its execution.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fazkhtw25vm7i0k6clacq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fazkhtw25vm7i0k6clacq.png" alt="Successful GitHub CI flow" width="800" height="401"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;CI validation is not merely a step in DevOps; it is the bridge between stable local infrastructure and immaculate deployment pipelines in the cloud.&lt;/p&gt;

&lt;p&gt;The workflow YAML lives in the LLMOps GitHub repository, on the "enterprise-saas-mor" branch.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://github.com/UniverseScripts/llmops/tree/enterprise-saas-mor" rel="noopener noreferrer"&gt;https://github.com/UniverseScripts/llmops/tree/enterprise-saas-mor&lt;/a&gt;&lt;/p&gt;

</description>
      <category>cicd</category>
      <category>devops</category>
      <category>githubactions</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>📈Visualizing the Edge: Translating Kubernetes Telemetry into Financial Throughput 📉</title>
      <dc:creator>Yoshio Nomura</dc:creator>
      <pubDate>Wed, 25 Mar 2026 09:44:19 +0000</pubDate>
      <link>https://dev.to/asterios07/visualizing-the-edge-translating-kubernetes-telemetry-into-financial-throughput-3404</link>
      <guid>https://dev.to/asterios07/visualizing-the-edge-translating-kubernetes-telemetry-into-financial-throughput-3404</guid>
      <description>&lt;p&gt;👉 A harsh reality of building LLMOps platforms: if you cannot visualize your traffic, your trans-continental routing is effectively a black box.&lt;/p&gt;

&lt;p&gt;In the previous phase of my edge architecture, I orchestrated a fault-tolerant K3s control plane and injected Prometheus to scrape the B2B inference nodes. Today, I materialized the business value by deploying the Grafana observability matrix directly into the cluster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyoypt9vljavyz4dbbueb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyoypt9vljavyz4dbbueb.png" alt="grafana graph with k8s" width="800" height="430"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🟢 1. The Declarative Datasource&lt;br&gt;
The infrastructure is defined by code, not manual UI clicks. Grafana boots with a pre-configured, immutable ConfigMap binding it strictly to the internal Prometheus service. If a node dies, the visualization re-spawns instantly with the exact same state.&lt;/p&gt;
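
&lt;p&gt;For illustration, Grafana datasource provisioning delivered through a ConfigMap looks roughly like this (resource names, namespace, and the Service URL are assumptions, not the repo's exact manifest):&lt;/p&gt;

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources
  namespace: monitoring
data:
  datasources.yaml: |
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        url: http://prometheus:9090   # internal Service DNS, assumed name
        isDefault: true
```

&lt;p&gt;Mounted into /etc/grafana/provisioning/datasources/, this file makes every re-spawned Grafana pod boot already bound to Prometheus, with no manual UI clicks.&lt;/p&gt;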

&lt;p&gt;🟢 2. The Financial Translation&lt;br&gt;
This isn't just about tracking CPU and memory limits. It is about cryptographic financial validation. By importing custom dashboards and observing the simulated trans-continental traffic, I have translated raw Kubernetes compute into a verifiable financial ledger.&lt;/p&gt;

&lt;p&gt;🟢 3. The Stress Test&lt;br&gt;
Operating under the strict resource constraints of consumer edge silicon, the architecture successfully absorbed a 50-user concurrent swarm. The NGINX ingress routed the payloads, the API keys were cryptographically verified against the PostgreSQL ledgers, and the distributed tokens were deducted with sub-millisecond latency.&lt;/p&gt;

&lt;p&gt;The architecture is now a fully observable Merchant of Record (MoR) perimeter.&lt;br&gt;
Link: &lt;a href="https://github.com/UniverseScripts/llmops/tree/enterprise-saas-mor" rel="noopener noreferrer"&gt;https://github.com/UniverseScripts/llmops/tree/enterprise-saas-mor&lt;/a&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>kubernetes</category>
      <category>llm</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>💫 Securing Global B2B Routing: Pivotal Edge Engineering 💫</title>
      <dc:creator>Yoshio Nomura</dc:creator>
      <pubDate>Mon, 23 Mar 2026 09:54:09 +0000</pubDate>
      <link>https://dev.to/asterios07/securing-global-b2b-routing-pivotal-edge-engineering-44n9</link>
      <guid>https://dev.to/asterios07/securing-global-b2b-routing-pivotal-edge-engineering-44n9</guid>
      <description>&lt;p&gt;👉 A brutal truth of LLMOps using edge engineering: your hardware will fail before your software architecture does.&lt;/p&gt;

&lt;p&gt;‼️ Operating a horizontally scaled Kubernetes (K3s) control plane from a consumer-grade node to capture global enterprise payloads presents a strict physical boundary. Virtualization layers (WSL2, containerd) inevitably fracture the hardware bridge. When a multi-billion-parameter LLM attempts to load into VRAM without the proper runtime configured (the NVIDIA Container Runtime, for instance), the API server panics and the architecture enters a permanent CrashLoopBackOff. ❌ &lt;/p&gt;

&lt;p&gt;I did not sacrifice the architecture to appease the hardware. Here is what I did instead:&lt;/p&gt;

&lt;p&gt;🟢 1. The Amputation: The heavy INT8 quantization and Hugging Face tensor allocations were physically stripped from the ASGI event loop, replaced with a lightweight endpoint response. &lt;/p&gt;

&lt;p&gt;✅ This was done on the presumption that the code works fine with the LLMOps tensors injected via LoRA. Stripping them lets me test the other deployment features without risking the storage limit or CPU throttling.&lt;/p&gt;
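
&lt;p&gt;A minimal sketch of what such an amputated endpoint can look like (the environment-variable name and the stub text are my illustrative assumptions, not the repo's actual code):&lt;/p&gt;

```python
import os

# Hypothetical feature flag (the env var name is my assumption): when set,
# the node never touches the multi-GB tensors, so VRAM stays untouched.
STUB_MODE = os.getenv("LLM_STUB_MODE", "1") == "1"

def load_model():
    """Return None in stub mode so no VRAM is ever allocated."""
    if STUB_MODE:
        return None
    # Heavy path: the quantized base model plus LoRA adapter would load here.
    raise RuntimeError("full model loading disabled in this sketch")

def generate(prompt: str) -> str:
    model = load_model()
    if model is None:
        # Lightweight endpoint response standing in for real inference.
        return "[stub] " + prompt[:32]
    raise NotImplementedError
```

&lt;p&gt;Every downstream component (ingress, billing, metrics) still sees a live endpoint, while the boot time drops to fractions of a second.&lt;/p&gt;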

&lt;p&gt;🟢 2. The Stateful Validation: With the CPU limits protected, the true enterprise matrix booted in fractions of a second. The stateless worker swarms initialized, mathematically locking into the PostgreSQL billing ledgers and the distributed Redis token buckets. The NGINX ingress controller instantly achieved stateful equilibrium.&lt;/p&gt;

&lt;p&gt;✅ This ensures other DevOps operations (routing, metrics) are successfully connected to the main FastAPI application without much wait time due to heavy LLM tensors.&lt;/p&gt;

&lt;p&gt;🟢 3. The Global Perimeter: Cryptographic SaaS webhooks are now mathematically validated. Prometheus scrapes the headless matrix every 15 seconds, proving the sub-millisecond latency of the trans-continental routing. The infrastructure is entirely observable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgp5avp2ntrp8jptdday7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgp5avp2ntrp8jptdday7.png" alt="Kubectl Prometheus execute" width="800" height="430"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;✅ DevOps secured: both security and observability are online!&lt;/p&gt;

&lt;p&gt;The generated text of an AI model is transient data. The fault-tolerant structure that routes, limits, and monetizes it is the only enduring reality. Do not burn your control plane attempting to force heavy inference on isolated hardware. Prioritize the routing, and deploy the perimeter.&lt;/p&gt;

&lt;p&gt;For the LLMOps infrastructure, the codebase is entirely open-sourced in the GitHub repository, on the "enterprise-saas-mor" branch.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://github.com/UniverseScripts/llmops/tree/enterprise-saas-mor" rel="noopener noreferrer"&gt;https://github.com/UniverseScripts/llmops/tree/enterprise-saas-mor&lt;/a&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>devops</category>
      <category>kubernetes</category>
      <category>llm</category>
    </item>
    <item>
      <title>❌ The Death of the Monolith: Tearing down Docker Compose for Kubernetes Orchestration ❌</title>
      <dc:creator>Yoshio Nomura</dc:creator>
      <pubDate>Fri, 20 Mar 2026 14:02:47 +0000</pubDate>
      <link>https://dev.to/asterios07/the-death-of-the-monolith-tearing-down-docker-compose-for-kubernetes-orchestration-4i3</link>
      <guid>https://dev.to/asterios07/the-death-of-the-monolith-tearing-down-docker-compose-for-kubernetes-orchestration-4i3</guid>
      <description>&lt;p&gt;My Phase 6 matrix utilized a single-node Docker Compose bridge. It was mathematically sufficient for localized testing, but when subjected to trans-continental concurrent payloads, the physical GPU limits of a single machine create a single point of failure. You cannot enforce a B2B SLA on a localized monolith.&lt;/p&gt;

&lt;p&gt;👉 Today, I physically tore down the Docker Compose bridge and initiated Phase 7: Distributed Edge Orchestration using K3s (Lightweight Kubernetes).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F91s63cbtp9xa43pryq5m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F91s63cbtp9xa43pryq5m.png" alt="IDE showing Docker-c down and Kubernetes active" width="800" height="393"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🤔 The Architectural Decoupling:&lt;/p&gt;

&lt;p&gt;🟢 Stateful Ledgers (The Capital): The PostgreSQL billing ledger and the Redis token bucket are now isolated into strict StatefulSet and Service manifests. They persist regardless of hardware degradation.&lt;/p&gt;

&lt;p&gt;🟢 Stateless Swarm (The Compute): The FastAPI inference nodes (running the Flan-T5 LoRA matrix) are now deployed as horizontally scaling Deployments.&lt;/p&gt;
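
&lt;p&gt;A minimal sketch of such a stateless Deployment (the image reference, port, and resource limits are illustrative assumptions, not the repo's exact manifest):&lt;/p&gt;

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-api
spec:
  replicas: 3                # the horizontally scaling stateless swarm
  selector:
    matchLabels:
      app: inference-api
  template:
    metadata:
      labels:
        app: inference-api
    spec:
      containers:
        - name: fastapi
          image: ghcr.io/universescripts/llmops:latest   # assumed image ref
          ports:
            - containerPort: 8000
          resources:
            limits:
              cpu: "1"
              memory: 1Gi
```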

&lt;p&gt;Why Kubernetes over Hyperscalers? K3s strips out cloud-provider bloat, allowing me to orchestrate bare-metal edge nodes with zero ongoing capital expenditure. If one physical worker agent hits its thermal ceiling, the ingress controller routes the traffic to the next available agent instantly.&lt;/p&gt;

&lt;p&gt;✅ The architecture is no longer a localized script. It is a distributed, fault-tolerant control plane.&lt;/p&gt;

&lt;p&gt;The YAML files for the Kubernetes orchestration are open-sourced in the GitHub repository on the "enterprise-saas-mor" branch.&lt;/p&gt;

&lt;p&gt;Link: &lt;a href="https://github.com/UniverseScripts/llmops/tree/enterprise-saas-mor" rel="noopener noreferrer"&gt;https://github.com/UniverseScripts/llmops/tree/enterprise-saas-mor&lt;/a&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>docker</category>
      <category>containers</category>
    </item>
    <item>
      <title>Sovereign API Constraints: Ripping out Stripe for a Global Merchant of Record (MoR)</title>
      <dc:creator>Yoshio Nomura</dc:creator>
      <pubDate>Wed, 18 Mar 2026 11:32:37 +0000</pubDate>
      <link>https://dev.to/asterios07/sovereign-api-constraints-ripping-out-stripe-for-a-global-merchant-of-record-mor-b16</link>
      <guid>https://dev.to/asterios07/sovereign-api-constraints-ripping-out-stripe-for-a-global-merchant-of-record-mor-b16</guid>
      <description>&lt;p&gt;🚀 Engineering a zero-trust local LLM cluster is only 50% of the B2B equation. The other 50% is global capital extraction.&lt;/p&gt;

&lt;p&gt;I initially engineered my Phase 6 Edge Cluster using the Stripe SDK for token metering and webhook synchronization. The cryptography was mathematically sound, and the asynchronous event loop preserved my p95 inference latency.&lt;/p&gt;

&lt;p&gt;✖️ Then I hit the sovereign API perimeter. Standard Stripe accounts do not legally support business entities natively operating in my geographic region.&lt;/p&gt;

&lt;p&gt;When your architecture collides with geopolitical reality, you do not mourn the dependency; you rip it out. I immediately bifurcated the repository and hot-swapped the financial gateway to a Merchant of Record (Lemon Squeezy).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;secret_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;LEMON_SQUEEZY_WEBHOOK_SECRET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;expected_signature&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;secret_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compare_digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expected_signature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sig_header&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cryptographic signature spoofing detected. Connection dropped.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Invalid signature.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🔻 Why an MoR is mathematically superior for solo global SaaS 🔻&lt;/p&gt;

&lt;p&gt;🟢 Zero Tax Liability: The MoR acts as the legal reseller. They calculate and remit global VAT/Sales Tax, completely severing my localized node from international tax compliance overhead.&lt;/p&gt;

&lt;p&gt;🟢 Geographic Inclusion: They natively support global payouts without requiring the friction and capital burn of US LLC virtualization (Stripe Atlas).&lt;/p&gt;

&lt;p&gt;🟢 The Webhook Parity: The cryptographic webhook perimeter I built for Stripe was seamlessly mapped to the MoR's X-Signature HMAC-SHA256 payloads. The architecture remains strictly consumption-based. When the MoR webhook fires an order_created event, my localized PostgreSQL database asynchronously translates that USD into LLM inference tokens.&lt;/p&gt;
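
&lt;p&gt;The USD-to-token translation can be sketched roughly like this (the conversion rate and the payload field names are illustrative stand-ins, not the actual Lemon Squeezy webhook schema):&lt;/p&gt;

```python
TOKENS_PER_USD = 1_000  # hypothetical conversion rate, not taken from the repo

def credit_tokens(event: dict) -> int:
    """Translate an MoR webhook event into inference-token credit.

    The field names below are illustrative stand-ins, not the exact
    Lemon Squeezy payload schema.
    """
    meta = event.get("meta", {})
    if meta.get("event_name") != "order_created":
        return 0  # ignore refunds, subscription renewals, etc.
    usd_total = event["data"]["attributes"]["total_usd"]
    return int(usd_total * TOKENS_PER_USD)
```

&lt;p&gt;In the real pipeline this credit is written to the PostgreSQL ledger asynchronously, after the HMAC signature check has already passed.&lt;/p&gt;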

&lt;p&gt;The core architecture is located at the "enterprise-saas-mor" branch of my GitHub repository.&lt;br&gt;
💣Link: &lt;a href="https://github.com/UniverseScripts/llmops/tree/enterprise-saas-mor" rel="noopener noreferrer"&gt;https://github.com/UniverseScripts/llmops/tree/enterprise-saas-mor&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>apigateway</category>
      <category>api</category>
      <category>devops</category>
    </item>
    <item>
      <title>✍ Architecting Enterprise SaaS: JWTs to synchronous payment gateways. ✍</title>
      <dc:creator>Yoshio Nomura</dc:creator>
      <pubDate>Mon, 16 Mar 2026 12:01:06 +0000</pubDate>
      <link>https://dev.to/asterios07/architecting-enterprise-saas-jwts-to-synchronous-payment-gateways-99o</link>
      <guid>https://dev.to/asterios07/architecting-enterprise-saas-jwts-to-synchronous-payment-gateways-99o</guid>
      <description>&lt;p&gt;❌ A common point of failure in scaling local LLM infrastructure into a B2B SaaS is the friction between authorization and financial state.&lt;/p&gt;

&lt;p&gt;Localized sessions or database-heavy auth checks fracture when you introduce a payment gateway like Stripe. Every millisecond spent querying a database to verify a user's session is a millisecond stolen from the ASGI event loop and the GPU inference queue.&lt;/p&gt;

&lt;p&gt;👉 To bypass this, I engineered a bifurcated authorization matrix for the Phase 6 Edge Cluster:&lt;/p&gt;

&lt;p&gt;✅ 1. The Stateless Perimeter (JWT): We utilize JSON Web Tokens for pure cryptographic authorization. Once issued, the FastAPI routers do not query PostgreSQL to verify identity. The cryptography proves the user's right to access the inference endpoint, dropping authorization latency to near-zero.&lt;/p&gt;
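
&lt;p&gt;To make the stateless check concrete, here is a self-contained HS256 sketch using only the standard library (a production deployment would use a maintained library such as PyJWT; the helper names are illustrative):&lt;/p&gt;

```python
import base64, hashlib, hmac, json, time

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def sign_jwt(claims: dict, secret: str) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    sig = hmac.new(secret.encode(), (header + "." + payload).encode(),
                   hashlib.sha256).digest()
    return header + "." + payload + "." + _b64url(sig)

def verify_jwt(token: str, secret: str) -> dict:
    """Stateless check: pure cryptography, no PostgreSQL round-trip."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    signing_input = (header_b64 + "." + payload_b64).encode()
    expected = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("invalid signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if time.time() > claims.get("exp", float("inf")):
        raise ValueError("token expired")
    return claims
```

&lt;p&gt;The verify path touches no database: one HMAC computation and a constant-time comparison decide access to the inference endpoint.&lt;/p&gt;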

&lt;p&gt;✅ 2. The Stateful Ledger (Stripe + PostgreSQL): While identity is stateless, capital is strictly stateful. We must guarantee that no user can exist in our system without a corresponding billing ledger.&lt;/p&gt;

&lt;p&gt;🟢 Here is how I implemented the interception layer:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F21ed1jio01ihjcatewyh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F21ed1jio01ihjcatewyh.png" alt="Stripe setup" width="800" height="412"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;During the /register execution, we inject a synchronous call to the Stripe API before the database commits the local user creation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxapvtalqh1ks0k9q0mgt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxapvtalqh1ks0k9q0mgt.png" alt="Auth with Stripe" width="800" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If the upstream Stripe network times out, the FastAPI router violently aborts the transaction. This mathematically guarantees zero orphaned accounts. If the transaction succeeds, the Stripe Customer ID is etched directly into the PostgreSQL row, binding the global financial network to the local LLM node.&lt;/li&gt;
&lt;/ul&gt;
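
&lt;p&gt;The interception layer can be sketched as follows (the function and variable names are hypothetical: create_customer stands in for the synchronous Stripe SDK call, and the list stands in for the PostgreSQL commit):&lt;/p&gt;

```python
class GatewayTimeout(Exception):
    """Raised when the upstream payment network times out."""

def register_user(email: str, create_customer, db_rows: list) -> dict:
    # 1. The upstream call happens BEFORE any local state is committed.
    try:
        customer_id = create_customer(email)
    except GatewayTimeout:
        # 2. Abort the whole transaction: zero orphaned accounts.
        raise RuntimeError("registration aborted: billing gateway unreachable")
    # 3. Only now is the customer ID etched into the local row.
    row = {"email": email, "stripe_customer_id": customer_id}
    db_rows.append(row)
    return row
```

&lt;p&gt;Because the gateway call precedes the commit, a timeout leaves the database exactly as it was, which is the zero-orphaned-accounts guarantee described above.&lt;/p&gt;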

&lt;p&gt;The core infrastructure remains open-source on GitHub. &lt;br&gt;
Link: &lt;a href="https://github.com/UniverseScripts/llmops" rel="noopener noreferrer"&gt;https://github.com/UniverseScripts/llmops&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>devops</category>
      <category>opensource</category>
    </item>
    <item>
      <title>The Death of Transient Memory: Engineering a Zero-Cost B2B LLM Edge Cluster</title>
      <dc:creator>Yoshio Nomura</dc:creator>
      <pubDate>Thu, 12 Mar 2026 04:05:50 +0000</pubDate>
      <link>https://dev.to/asterios07/the-death-of-transient-memory-engineering-a-zero-cost-b2b-llm-edge-cluster-3c2</link>
      <guid>https://dev.to/asterios07/the-death-of-transient-memory-engineering-a-zero-cost-b2b-llm-edge-cluster-3c2</guid>
      <description>&lt;p&gt;A functional local inference node is merely a prototype. An observable, stateful inference node is enterprise infrastructure.&lt;/p&gt;

&lt;p&gt;The current standard of wrapping quantized LLMs in basic FastAPI endpoints and exposing them to the global internet is fundamentally flawed. When subjected to concurrent B2B payloads, in-memory token buckets fracture. The ASGI event loop bottlenecks, GPU VRAM fragments, and the node dies silently. &lt;/p&gt;

&lt;p&gt;To eradicate transient memory anomalies and bypass hyperscaler billing, I engineered a fully distributed, zero-trust Docker bridge matrix.&lt;/p&gt;

&lt;p&gt;Here is the strict architectural progression of the edge cluster.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The Compute Layer (Bypassing VRAM Fragmentation)
&lt;/h3&gt;

&lt;p&gt;Standard localized nodes load unoptimized FP32 tensors, instantly saturating consumer hardware.&lt;br&gt;
This matrix utilizes &lt;code&gt;google/flan-t5-base&lt;/code&gt; with 8-bit precision (&lt;code&gt;BitsAndBytesConfig&lt;/code&gt;). To allow enterprise-specific instruction alignment without full-parameter overhead, the base model is merged with a Low-Rank Adaptation (LoRA) via &lt;code&gt;peft&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. The State Layer (Eradicating Localized Memory)
&lt;/h3&gt;

&lt;p&gt;FastAPI &lt;code&gt;dict&lt;/code&gt; objects cannot manage concurrent state. We strictly externalize it.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Authorization:&lt;/strong&gt; API keys are validated against a persistent PostgreSQL volume using asynchronous non-blocking I/O (&lt;code&gt;asyncpg&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Atomic Rate Limiting:&lt;/strong&gt; A localized Redis container executes asynchronous Lua pipelines (&lt;code&gt;transaction=True&lt;/code&gt;). This guarantees atomic evaluations of payload frequency, violently returning HTTP 429s to hostile actors before they penetrate the inference queue.&lt;/li&gt;
&lt;/ul&gt;
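
&lt;p&gt;For intuition, here is an in-process model of the token-bucket logic that the Redis Lua pipeline evaluates; in production the whole check-and-decrement runs server-side inside Redis so concurrent workers cannot interleave (names and rates here are illustrative):&lt;/p&gt;

```python
import time

class TokenBucket:
    """In-process model of the atomic check performed inside Redis."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the caller answers with HTTP 429
```

&lt;p&gt;Pushing this logic into a single Lua script is what makes the evaluation atomic: Redis executes the script without interleaving other commands.&lt;/p&gt;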

&lt;h3&gt;
  
  
  3. The Routing Layer (Zero-Trust Isolation)
&lt;/h3&gt;

&lt;p&gt;The application layer is entirely severed from the localized host environment to prevent kernel port collisions.&lt;br&gt;
All internal traffic routes through an isolated Traefik reverse proxy. Global ingress is handled via a direct HTTP2 TCP Cloudflare tunnel, bypassing hypervisor UDP limits and local firewall ACLs entirely.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. The Observability Matrix
&lt;/h3&gt;

&lt;p&gt;Infrastructure without telemetry is a black box. &lt;br&gt;
Prometheus silently scrapes the Uvicorn workers every 5 seconds. The TSDB is explicitly whitelisted from the Redis token bucket to prevent a self-inflicted denial of service. Grafana is provisioned via Infrastructure as Code (IaC), etching the dashboards directly into the container state.&lt;/p&gt;
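
&lt;p&gt;The scrape cadence above corresponds to a Prometheus configuration along these lines (job and target names are assumptions, not the repo's exact file):&lt;/p&gt;

```yaml
global:
  scrape_interval: 5s          # the 5-second cadence described above
scrape_configs:
  - job_name: uvicorn-workers
    static_configs:
      - targets: ["api:8000"]  # internal Docker DNS name, assumed
```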

&lt;h3&gt;
  
  
  The Chaos Engineering Benchmark
&lt;/h3&gt;

&lt;p&gt;To mathematically prove the architecture's load-bearing capability, the node was subjected to a 150-concurrent-user synthetic swarm utilizing Locust.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9kxgcdkt9g9ymfq0rwi0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9kxgcdkt9g9ymfq0rwi0.png" alt="time-series graph delineating request latency per endpoint visit" width="800" height="286"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Famidjjompv5d8inp575h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Famidjjompv5d8inp575h.png" alt="time-series graph delineating 429 Response logs" width="800" height="292"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The telemetry proves the mathematical truth: &lt;/p&gt;

&lt;p&gt;The atomic Redis transactions successfully identified the payload overflow in milliseconds, aggressively returning HTTP 429s. The Uvicorn workers remained shielded, and the p95 latency for accepted trans-continental payloads remained perfectly stable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The complete, verifiable infrastructure is open-sourced here:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://github.com/UniverseScripts/llmops" rel="noopener noreferrer"&gt;GitHub Repository Link&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Question for the infrastructure architects: When load-balancing edge inference, are you standardizing on Traefik or native Nginx for your internal Docker DNS resolution? Defend your routing latency below.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>architecture</category>
      <category>python</category>
      <category>security</category>
    </item>
    <item>
      <title>‼️ The Architecture of Local LLMOps Collapse: Why Your FastAPI Inference Node is Failing. ‼️</title>
      <dc:creator>Yoshio Nomura</dc:creator>
      <pubDate>Wed, 04 Mar 2026 14:38:50 +0000</pubDate>
      <link>https://dev.to/asterios07/-the-architecture-of-local-llmops-collapse-why-your-fastapi-inference-node-is-failing--198b</link>
      <guid>https://dev.to/asterios07/-the-architecture-of-local-llmops-collapse-why-your-fastapi-inference-node-is-failing--198b</guid>
      <description>&lt;p&gt;🤔 The assumption that a standard ASGI framework can natively serve synchronous, quantized LLM tensors is flawed. In architecting a localized RAG node, the baseline open-source stack guarantees infrastructure collapse across three distinct reasons.&lt;/p&gt;

&lt;p&gt;👉 Here is the breakdown of the failure states and the required enterprise optimizations:&lt;/p&gt;

&lt;h2&gt;
  
  
  The Concurrency Gridlock
&lt;/h2&gt;

&lt;p&gt;Executing a Hugging Face model.generate() call inside a native FastAPI route paralyzes the core event loop. Standard tensor mathematics block the thread. Under concurrent B2B traffic, the node hangs indefinitely. &lt;/p&gt;

&lt;p&gt;✅ Fix: State isolation and threadpool offloading. Bind the quantized model directly to app.state during the lifespan boot, and utilize starlette.concurrency to push the synchronous generation matrix outside the ASGI loop.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Python

from fastapi import APIRouter, HTTPException, Request
from schemas.generate import GenerateContext, GenerateResponse
import torch
import starlette.concurrency as concurrency

router = APIRouter(prefix="/generate", tags=["generate"])

def synchronous_generation(prompt: str, model, tokenizer, max_new_tokens: int) -&amp;gt; str:
    inputs = tokenizer(prompt, return_tensors="pt", max_length=256, truncation=True).to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            input_ids=inputs["input_ids"],
            max_new_tokens=max_new_tokens,
            temperature=0.3,
            do_sample=True,
        )

    return tokenizer.decode(outputs[0], skip_special_tokens=True)

@router.post("/", response_model=GenerateResponse)
async def generate_request(payload: GenerateContext, request: Request):

    model = getattr(request.app.state, "model", None)
    tokenizer = getattr(request.app.state, "tokenizer", None)

    if model is None or tokenizer is None:
        raise HTTPException(status_code=503, detail="Model uninitialized in VRAM")

    prompt = f"Instruction: {payload.instructions}\n Context: {payload.context}\n Response:"

    try:

        result = await concurrency.run_in_threadpool(
            synchronous_generation,
            prompt,
            model,
            tokenizer,
            payload.max_new_tokens,
        )

        return GenerateResponse(completion=result)

    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Virtualization Blindspot
&lt;/h2&gt;

&lt;p&gt;Deploying an nvidia/cuda Docker container on a Windows host running legacy OEM drivers (e.g., NVIDIA 451.xx) results in a catastrophic WDDM routing failure. The WSL2 hypervisor cannot bridge the physical hardware to the container daemon. &lt;/p&gt;

&lt;p&gt;✅ Fix: Force a clean installation of modern NVIDIA Studio Drivers to ensure the host machine projects CUDA 11.8+ capability through the virtualization layer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo0u325ay91bylzcc4tqz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo0u325ay91bylzcc4tqz.png" alt="Split terminal running " width="800" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Dependency Hell
&lt;/h2&gt;

&lt;p&gt;Attempting to inject a LoRA adapter trained on peft==0.18.1 into an inference environment stabilized on peft==0.6.2 triggers fatal schema validation errors (alora_invocation_tokens). &lt;/p&gt;

&lt;p&gt;👉 Fix: Dependency immutability is non-negotiable. Do not upgrade the container to match the metadata. Surgically prune the JSON configuration payload, eradicating all experimental parameters to force backward compatibility with the stable 0.6.2 inference engine.&lt;/p&gt;
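&lt;p&gt;The pruning step can be sketched as an allowlist filter over adapter_config.json. The key set below is illustrative, not the exhaustive list peft 0.6.2 accepts; derive the real allowlist from that version's LoraConfig fields:&lt;/p&gt;

```python
import json

# Illustrative allowlist of fields a legacy LoraConfig understands.
# Derive the real set from the pinned peft version, not from this sketch.
ALLOWED_KEYS = {
    "base_model_name_or_path", "peft_type", "task_type", "r",
    "lora_alpha", "lora_dropout", "target_modules", "bias",
    "fan_in_fan_out", "inference_mode", "modules_to_save",
}

def prune_adapter_config(config):
    """Drop experimental keys (e.g. alora_invocation_tokens) so an adapter
    trained on a newer peft loads under an older inference engine."""
    return {k: v for k, v in config.items() if k in ALLOWED_KEYS}

# Typical usage against the adapter directory:
# with open("adapter_config.json") as f:
#     cfg = prune_adapter_config(json.load(f))
# with open("adapter_config.json", "w") as f:
#     json.dump(cfg, f, indent=2)
```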

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcp25rnude2h288wgmlx4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcp25rnude2h288wgmlx4.png" alt="Fatal Python Traceback due to conflicting and missing modules" width="800" height="145"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Stop relying on optimistic tutorials. True engineering requires forcing unstable tools into strict compliance.&lt;/p&gt;

&lt;p&gt;🧑‍💻 The fully stabilized, Dockerized boilerplate for this inference node is available for inspection and deployment here: &lt;a href="https://github.com/UniverseScripts/llmops" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>devops</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Building an Extraction Node: Analyzing 400+ HN Job Listings (Python vs Node.js)</title>
      <dc:creator>Yoshio Nomura</dc:creator>
      <pubDate>Sun, 22 Feb 2026 01:39:34 +0000</pubDate>
      <link>https://dev.to/asterios07/building-an-extraction-node-analyzing-400-hn-job-listings-python-vs-nodejs-1ga7</link>
      <guid>https://dev.to/asterios07/building-an-extraction-node-analyzing-400-hn-job-listings-python-vs-nodejs-1ga7</guid>
      <description>&lt;h2&gt;
  
  
  &lt;strong&gt;The Inefficiency of the Job Market&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The modern technical job hunt operates on an asymmetrical information model. Candidates manually process unstructured text across disparate platforms, while corporations utilize automated applicant tracking systems to filter them out. The logical countermeasure is to construct a programmatic extraction pipeline to identify the true market signal.&lt;/p&gt;

&lt;p&gt;To bypass the saturated and often misleading postings on mainstream corporate networks, the data source must be raw and developer-centric. This system utilizes the Hacker News "Who is Hiring" thread as the primary target for extraction.&lt;/p&gt;

&lt;p&gt;Below is the architectural breakdown of how to build an extraction node to parse, categorize, and synthesize 400+ unstructured job listings into a structured dataset.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. The Extraction Pipeline
&lt;/h2&gt;

&lt;p&gt;Unstructured text from forums presents a parsing challenge. Traditional regex patterns fail when human formatting is inconsistent. The pipeline must operate in two phases: retrieval and synthesis.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 1: Retrieval
&lt;/h3&gt;

&lt;p&gt;The official Hacker News Firebase API returns each comment as JSON, so the initial retrieval needs no HTML scraping.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Python

import requests

HN_API = "https://hacker-news.firebaseio.com/v0/item/{}.json"

def fetch_hn_thread(item_id: str) -&amp;gt; list:
    """Retrieves all top-level comments from an HN "Who is Hiring" thread."""
    session = requests.Session()  # reuse one TCP connection across hundreds of calls
    response = session.get(HN_API.format(item_id), timeout=10).json()

    comments = []
    for child_id in response.get('kids', []):
        child = session.get(HN_API.format(child_id), timeout=10).json()
        if child and 'text' in child:
            comments.append(child['text'])

    return comments
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Phase 2: LLM Synthesis
&lt;/h3&gt;

&lt;p&gt;Once the raw HTML strings are retrieved, an LLM endpoint (e.g., Llama 3 or a structured output API) is required to enforce a JSON schema on the unstructured text. This isolates specific variables: Role, Stack, Salary, Remote Status, and Visa Sponsorship.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Python

# System prompt engineering for deterministic output
schema_prompt = """
Extract the following fields from the job posting. 
Return ONLY valid JSON.
{
  "company": "string",
  "role": "string",
  "stack": ["string"],
  "remote": "Global" | "US Only" | "None",
  "visa_sponsorship": boolean,
  "salary_min": number | null,
  "salary_max": number | null
}
"""
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
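&lt;p&gt;LLMs occasionally return malformed or incomplete JSON even under a strict prompt, so each response should pass a schema gate before it enters the dataset. A minimal sketch, with field names mirroring the prompt above:&lt;/p&gt;

```python
import json

REQUIRED_FIELDS = {
    "company", "role", "stack", "remote",
    "visa_sponsorship", "salary_min", "salary_max",
}
REMOTE_VALUES = {"Global", "US Only", "None"}

def parse_listing(raw):
    """Parse one LLM response; return the record, or None if it fails the schema."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    if not REQUIRED_FIELDS.issubset(record):
        return None
    if record["remote"] not in REMOTE_VALUES:
        return None
    return record
```

&lt;p&gt;Rejected responses can simply be re-queued for another inference pass rather than silently corrupting the dataset.&lt;/p&gt;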



&lt;h2&gt;
  
  
  2. The Data Synthesis
&lt;/h2&gt;

&lt;p&gt;Running this pipeline against the February 2026 data yielded more than 400 discrete technical roles. The empirical data contradicts several prevailing market narratives.&lt;/p&gt;

&lt;p&gt;The Remote Distribution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Global Remote: 37%&lt;/li&gt;
&lt;li&gt;US-Only Remote: 22%&lt;/li&gt;
&lt;li&gt;On-Site / Hybrid: 41%&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Conclusion: Remote work is not dead, but it is heavily geofenced. Applying to roles without verifying the geographic constraint results in a 22% baseline failure rate for international candidates.&lt;/p&gt;

&lt;p&gt;Visa Sponsorship Metrics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Only 14% of the extracted listings explicitly offer visa sponsorship.&lt;/li&gt;
&lt;li&gt;80% of those sponsorships are concentrated in the AI Infrastructure and Fintech sectors.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Technology Stack Premium:&lt;/p&gt;

&lt;p&gt;Python backend roles currently demonstrate a 15% salary premium over equivalent Node.js roles within this dataset.&lt;/p&gt;

&lt;p&gt;The market is signaling a rotation away from generalist JavaScript environments toward specialized, compute-heavy infrastructure languages (Python, Go, Rust).&lt;/p&gt;
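&lt;p&gt;The percentages above fall out of a one-line aggregation over the parsed records. A sketch with made-up sample data:&lt;/p&gt;

```python
from collections import Counter

def remote_distribution(listings):
    """Percent of listings per remote category, rounded to whole points."""
    counts = Counter(job["remote"] for job in listings)
    total = len(listings)
    return {cat: round(100 * n / total) for cat, n in counts.items()}

# Made-up sample records, shaped like the extractor's output.
sample = [
    {"remote": "Global"}, {"remote": "Global"},
    {"remote": "US Only"}, {"remote": "None"},
]
```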

&lt;h2&gt;
  
  
  3. Execution and Deployment
&lt;/h2&gt;

&lt;p&gt;The architecture detailed above is sufficient for any engineer to reconstruct this pipeline locally. Maintaining local scripts for data extraction provides a compounding advantage in market awareness.&lt;/p&gt;

&lt;p&gt;For those currently navigating the job market who require the immediate output without configuring the pipeline or absorbing the LLM inference costs, the compiled CSV dataset—containing the 400+ parsed roles, technology stacks, and verified global remote tags—is accessible here:&lt;br&gt; &lt;a href="https://job-scrapper-ai.streamlit.app" rel="noopener noreferrer"&gt;https://job-scrapper-ai.streamlit.app&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>discuss</category>
      <category>career</category>
    </item>
    <item>
      <title>How I built a Tinder-style Card Swipe in Next.js 16</title>
      <dc:creator>Yoshio Nomura</dc:creator>
      <pubDate>Tue, 27 Jan 2026 12:57:33 +0000</pubDate>
      <link>https://dev.to/asterios07/how-i-built-a-tinder-style-card-swipe-in-nextjs-16-592h</link>
      <guid>https://dev.to/asterios07/how-i-built-a-tinder-style-card-swipe-in-nextjs-16-592h</guid>
      <description>&lt;p&gt;I recently decided to build a mobile-first roommate finder app as a way to learn the new &lt;strong&gt;Next.js 16 App Router&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The hardest part? Building a "Card Stack" that feels like a native app (Tinder-style) without using heavy libraries like &lt;code&gt;framer-motion&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is how I solved it using just React 19, Tailwind CSS, and good old &lt;code&gt;useState&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Here is the gig!
&lt;/h2&gt;

&lt;p&gt;"Moving" a stack of DOM elements was never a thing. We have been rendering &lt;strong&gt;one&lt;/strong&gt; active card and changing the data behind it all this time!&lt;/p&gt;

&lt;p&gt;To do this, we need to track two things in our state:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The Index:&lt;/strong&gt; Which item in the array are we looking at?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Direction:&lt;/strong&gt; Is the user swiping Left (Reject) or Right (Like)?&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Code
&lt;/h2&gt;

&lt;p&gt;Here is the simplified logic from my &lt;code&gt;ExplorePage&lt;/code&gt; component:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;use client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;SwipeStack&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// 1. Track which card is active&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;currentIndex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setCurrentIndex&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 2. Track animation direction ('left' | 'right' | null)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;swipeDirection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setSwipeDirection&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handleSwipe&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;direction&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Step A: Trigger the animation&lt;/span&gt;
    &lt;span class="nf"&gt;setSwipeDirection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;direction&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Step B: Wait for animation to finish, then show next card&lt;/span&gt;
    &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;setCurrentIndex&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nf"&gt;setSwipeDirection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Reset animation&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Matches CSS transition duration&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;currentItem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;currentIndex&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;currentItem&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;No more profiles!&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"relative w-full max-w-sm h-[500px]"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; 
        &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;`
          transition-all duration-300 ease-out
          &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;swipeDirection&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;left&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;-translate-x-full -rotate-12 opacity-0&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;swipeDirection&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;right&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;translate-x-full rotate-12 opacity-0&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
        `&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="cm"&gt;/* Your Card Component */&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Card&lt;/span&gt; &lt;span class="na"&gt;item&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;currentItem&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="cm"&gt;/* Control Buttons */&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"flex gap-4 justify-center mt-8"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt; &lt;span class="na"&gt;onClick&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;handleSwipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;left&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;❌&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt; &lt;span class="na"&gt;onClick&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;handleSwipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;right&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;💚&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;button&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this code in place, here is what happens on each swipe:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Instant Feedback: When you click "Like", setSwipeDirection('right') adds the Tailwind classes translate-x-full and rotate-12. The card visually flies off the screen.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;State Update: The setTimeout waits exactly 300ms (the duration of our CSS transition). Once the card is off-screen, we increment currentIndex.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Reset: React re-renders with the new data at the same position (center), and we remove the animation classes. To the user, it looks like a brand new card appeared behind the old one.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Result&lt;br&gt;
This yields a buttery-smooth 60fps animation on mobile browsers, because we animate only &lt;code&gt;transform&lt;/code&gt; and &lt;code&gt;opacity&lt;/code&gt;, both of which the browser can composite on the GPU without triggering layout.&lt;/p&gt;

&lt;h2&gt;
  
  
  WAIT
&lt;/h2&gt;

&lt;p&gt;I open-sourced the entire UI kit, including the Swipe Logic, Bottom Navigation, and Chat Interface.&lt;/p&gt;

&lt;p&gt;You can grab the repo here to see how I handled the array filtering and mobile safe-areas:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/UniverseScripts/nextjs-marketplace-free" rel="noopener noreferrer"&gt;GitHub Repo: Next.js Mobile Marketplace Starter&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://nextjs-marketplace-free.vercel.app/" rel="noopener noreferrer"&gt;Live Demo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let me know if you have questions about the framework!&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>react</category>
      <category>nextjs</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
