<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: local ai</title>
    <description>The latest articles on DEV Community by local ai (@local_ai_28441e061d716cb1).</description>
    <link>https://dev.to/local_ai_28441e061d716cb1</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3692479%2F08465eeb-94d4-4ebf-ae41-044c2219ff22.png</url>
      <title>DEV Community: local ai</title>
      <link>https://dev.to/local_ai_28441e061d716cb1</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/local_ai_28441e061d716cb1"/>
    <language>en</language>
    <item>
      <title>【2026】Automatically Generate Scientific Figures With AI – No More Illustrator</title>
      <dc:creator>local ai</dc:creator>
      <pubDate>Wed, 01 Apr 2026 14:03:49 +0000</pubDate>
      <link>https://dev.to/local_ai_28441e061d716cb1/2026-generer-automatiquement-des-figures-scientifiques-avec-lia-fini-illustrator-hei</link>
      <guid>https://dev.to/local_ai_28441e061d716cb1/2026-generer-automatiquement-des-figures-scientifiques-avec-lia-fini-illustrator-hei</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;For researchers writing scientific papers, &lt;strong&gt;creating figures&lt;/strong&gt; is one of the most time-consuming tasks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creating a Graphical Abstract took half a day&lt;/li&gt;
&lt;li&gt;The reviewer asks: "Please redo Figure 3" – cue the despair&lt;/li&gt;
&lt;li&gt;No time to learn Illustrator or BioRender&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sound familiar?&lt;/p&gt;

&lt;p&gt;Thanks to advances in generative AI, it is now possible to &lt;strong&gt;automatically generate publication-quality scientific figures from nothing more than text instructions&lt;/strong&gt;. In this article, I explain how it works and present a concrete workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Limits of Traditional Tools
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Pros&lt;/th&gt;
&lt;th&gt;Cons&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Adobe Illustrator&lt;/td&gt;
&lt;td&gt;Great creative freedom&lt;/td&gt;
&lt;td&gt;Steep learning curve, monthly subscription&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BioRender&lt;/td&gt;
&lt;td&gt;Many templates&lt;/td&gt;
&lt;td&gt;From $39/month, limited customization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PowerPoint&lt;/td&gt;
&lt;td&gt;Easy to use&lt;/td&gt;
&lt;td&gt;Not good enough for publication&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;matplotlib / R&lt;/td&gt;
&lt;td&gt;Reproducible via code&lt;/td&gt;
&lt;td&gt;Plain-looking design, time-consuming&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All of these tools demand either &lt;strong&gt;design skills&lt;/strong&gt; or &lt;strong&gt;a lot of time&lt;/strong&gt; – often both.&lt;/p&gt;

&lt;h2&gt;
  
  
  How AI Figure Generation Works
&lt;/h2&gt;

&lt;p&gt;The basic architecture is as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User input (text / data)
        ↓
  LLM (layout design · element decomposition)
        ↓
  Image generation model (rendering)
        ↓
  Post-processing (style adjustment · label placement)
        ↓
  Output (PNG / SVG / PDF)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key is a &lt;strong&gt;two-stage pipeline&lt;/strong&gt;: the LLM first works out the "structure" of the figure, then the image generation model handles the "drawing." This preserves scientific accuracy while still producing a polished design.&lt;/p&gt;
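
&lt;p&gt;As a rough sketch, the two stages can be wired together in a few lines of Python. This is purely illustrative: &lt;code&gt;call_llm&lt;/code&gt; and &lt;code&gt;call_image_model&lt;/code&gt; are hypothetical stand-ins for whichever LLM and image-generation APIs you actually use.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json

def call_llm(prompt):
    """Stage 1 stand-in: ask an LLM for a structured layout plan."""
    raise NotImplementedError("wire up your LLM provider here")

def call_image_model(layout):
    """Stage 2 stand-in: render the planned layout to an image."""
    raise NotImplementedError("wire up your image model here")

def generate_figure(description):
    # Stage 1: the LLM turns free text into an explicit structure
    # (panels, labels, arrows) instead of drawing pixels directly.
    plan = json.loads(call_llm(
        "Return a JSON layout plan (panels, labels, arrows) for: "
        + description))
    # Stage 2: the image model only has to render that structure,
    # which is what keeps the scientific content under control.
    return call_image_model(plan)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;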

&lt;h2&gt;
  
  
  In Practice: Creating Scientific Figures With AI
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Method 1: Manual Prompt Engineering
&lt;/h3&gt;

&lt;p&gt;Give instructions directly to a multimodal LLM such as GPT-4o or Claude:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Please create a Graphical Abstract with the following content:
- Research topic: Protein structure prediction with deep learning
- Left: Input data (amino acid sequence)
- Center: Neural network processing
- Right: Output (3D structure)
- Style: Clean Cell / Nature-style design
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: The prompt needs fine-tuning every single time, and the quality is inconsistent. Specifying publication-ready formats (resolution, font, color palette) on every run is also tedious.&lt;/p&gt;

&lt;h3&gt;
  
  
  Method 2: Use a Specialized AI Tool
&lt;/h3&gt;

&lt;p&gt;A dedicated AI tool for scientific figures solves these problems. &lt;a href="https://sci-draw.com" rel="noopener noreferrer"&gt;&lt;strong&gt;SciDraw AI&lt;/strong&gt;&lt;/a&gt; is an AI service optimized for creating figures for scientific papers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📝 Publication quality from simple text instructions&lt;/li&gt;
&lt;li&gt;🎨 Graphical Abstracts, experimental workflow diagrams, concept figures, data visualization&lt;/li&gt;
&lt;li&gt;📐 Publication standards applied automatically (≥300 dpi, appropriate font sizes)&lt;/li&gt;
&lt;li&gt;🔄 Edits and adjustments possible after generation&lt;/li&gt;
&lt;li&gt;📥 Export as PNG, SVG, or PDF&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How to use it:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://sci-draw.com" rel="noopener noreferrer"&gt;sci-draw.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Describe the figure you want in text (French works too)&lt;/li&gt;
&lt;li&gt;The AI generates the figure&lt;/li&gt;
&lt;li&gt;Add revision instructions if needed&lt;/li&gt;
&lt;li&gt;Download the finished figure&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Graphical Abstract
&lt;/h3&gt;

&lt;p&gt;Journals often require a Graphical Abstract that sums up the research in a single image at submission time. With SciDraw AI, entering the paper's abstract is enough to generate a Graphical Abstract with a suitable layout.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Experimental Workflow Diagram
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Example: "Please create a diagram of the experimental procedure
for gene cloning by PCR.
Steps: DNA extraction → Primer design → PCR amplification → 
Gel electrophoresis → Ligation → Transformation"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Concept Figures and Mechanism Diagrams
&lt;/h3&gt;

&lt;p&gt;Complex biological mechanisms and engineering system diagrams can likewise be generated from a text description.&lt;/p&gt;

&lt;h2&gt;
  
  
  Caveats When Using AI Figures in Publications
&lt;/h2&gt;

&lt;p&gt;When using AI-generated figures in a scientific paper, keep the following in mind:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Check the journal's policy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI usage policies of major journals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nature&lt;/strong&gt;: Allowed if disclosed in the Methods (no authorship credit for AI)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Science&lt;/strong&gt;: Disclosure likewise required&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IEEE&lt;/strong&gt;: Recommends disclosing the use of AI-assisted tools&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Verify scientific accuracy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI-generated figures must always be checked by the researcher. The accuracy of structural formulas and numerical data remains a human responsibility.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Copyright and originality&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI-generated figures are generally treated as original content, but follow the guidelines of the journal in question.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Traditional approach&lt;/th&gt;
&lt;th&gt;AI generation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Creation time&lt;/td&gt;
&lt;td&gt;Hours to days&lt;/td&gt;
&lt;td&gt;Minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Design skills&lt;/td&gt;
&lt;td&gt;Required&lt;/td&gt;
&lt;td&gt;Not needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Quality consistency&lt;/td&gt;
&lt;td&gt;Varies by person&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ease of revision&lt;/td&gt;
&lt;td&gt;Manual work&lt;/td&gt;
&lt;td&gt;Instant revision via text instruction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;Illustrator from $22/month / BioRender from $39/month&lt;/td&gt;
&lt;td&gt;Free credits available&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Researchers' time should go to the &lt;strong&gt;research itself&lt;/strong&gt; – not to figure design. Use AI tools to streamline your publication workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Useful Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://sci-draw.com" rel="noopener noreferrer"&gt;SciDraw AI – AI tool for scientific figures&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://patentfig.ai" rel="noopener noreferrer"&gt;PatentFig AI – AI tool for patent drawings&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://datatopaper.com" rel="noopener noreferrer"&gt;Data2Paper – Automatic paper generation from data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>【2026】Automatically Create Scientific Figures With AI – Illustrator Is a Thing of the Past</title>
      <dc:creator>local ai</dc:creator>
      <pubDate>Tue, 31 Mar 2026 14:53:17 +0000</pubDate>
      <link>https://dev.to/local_ai_28441e061d716cb1/2026-wissenschaftliche-abbildungen-mit-ki-automatisch-erstellen-illustrator-war-gestern-14pe</link>
      <guid>https://dev.to/local_ai_28441e061d716cb1/2026-wissenschaftliche-abbildungen-mit-ki-automatisch-erstellen-illustrator-war-gestern-14pe</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;For researchers writing scientific papers, the &lt;strong&gt;creation of figures&lt;/strong&gt; is one of the most time-consuming tasks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A Graphical Abstract took half a day&lt;/li&gt;
&lt;li&gt;The reviewer writes: "Please redo Figure 3" – despair&lt;/li&gt;
&lt;li&gt;No time to learn Illustrator or BioRender&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sound familiar?&lt;/p&gt;

&lt;p&gt;Thanks to the rapid progress of generative AI, it is now possible to &lt;strong&gt;automatically generate publication-quality scientific figures from text instructions alone&lt;/strong&gt;. In this article, I explain how it works and present a concrete workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Limits of Conventional Tools
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Pros&lt;/th&gt;
&lt;th&gt;Cons&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Adobe Illustrator&lt;/td&gt;
&lt;td&gt;Great design freedom&lt;/td&gt;
&lt;td&gt;Steep learning curve, monthly cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BioRender&lt;/td&gt;
&lt;td&gt;Many templates&lt;/td&gt;
&lt;td&gt;From $39/month, limited customization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PowerPoint&lt;/td&gt;
&lt;td&gt;Easy to use&lt;/td&gt;
&lt;td&gt;Not sufficient for publication quality&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;matplotlib / R&lt;/td&gt;
&lt;td&gt;Reproducible via code&lt;/td&gt;
&lt;td&gt;Plain design, time-consuming&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All of these tools require either &lt;strong&gt;design skills or a lot of time&lt;/strong&gt; – often both.&lt;/p&gt;

&lt;h2&gt;
  
  
  How AI-Based Figure Generation Works
&lt;/h2&gt;

&lt;p&gt;The basic architecture looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User input (text / data)
        ↓
  LLM (layout planning · element decomposition)
        ↓
  Image generation model (rendering)
        ↓
  Post-processing (style adjustment · labeling)
        ↓
  Output (PNG / SVG / PDF)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key is a &lt;strong&gt;two-stage pipeline&lt;/strong&gt;: the LLM first understands the "structure" of the figure, then the image generation model takes over the "drawing." This preserves scientific accuracy while producing an appealing design.&lt;/p&gt;

&lt;h2&gt;
  
  
  In Practice: Creating Scientific Figures With AI
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Method 1: Manual Prompt Engineering
&lt;/h3&gt;

&lt;p&gt;Give instructions directly to a multimodal LLM such as GPT-4o or Claude:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Please create a Graphical Abstract with the following content:
- Research topic: Deep learning for protein structure prediction
- Left: Input data (amino acid sequence)
- Center: Neural network
- Right: Output (3D structure)
- Style: Clean Cell / Nature-style design
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: The prompt has to be fine-tuned every time, and the quality is inconsistent. It is also tedious to specify publication-ready formats (resolution, font, color scheme) on every run.&lt;/p&gt;
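
&lt;p&gt;One way around that repetition is to bake the publication constraints into a small reusable prompt template. Here is a minimal Python sketch; the constraint text and the &lt;code&gt;build_prompt&lt;/code&gt; helper are my own illustration, not any particular tool's API:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Reusable template so resolution / font / palette requirements
# don't have to be retyped for every figure (illustrative only).
PUB_CONSTRAINTS = (
    "Render at 300 dpi or higher, sans-serif labels of readable size, "
    "colorblind-safe palette, clean white background."
)

def build_prompt(figure_description):
    return f"{figure_description}\n\nFormatting requirements: {PUB_CONSTRAINTS}"

prompt = build_prompt(
    "Graphical Abstract: deep learning for protein structure prediction. "
    "Left: amino acid sequence. Center: neural network. Right: 3D structure."
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;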

&lt;h3&gt;
  
  
  Method 2: Use a Specialized AI Tool
&lt;/h3&gt;

&lt;p&gt;An AI tool specialized in scientific figures solves these problems. &lt;a href="https://sci-draw.com" rel="noopener noreferrer"&gt;&lt;strong&gt;SciDraw AI&lt;/strong&gt;&lt;/a&gt; is an AI service optimized for creating scientific figures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📝 Publication quality from text instructions alone&lt;/li&gt;
&lt;li&gt;🎨 Graphical Abstracts, experiment flowcharts, concept diagrams, data visualization&lt;/li&gt;
&lt;li&gt;📐 Automatic compliance with publication standards (≥300 dpi, suitable font sizes)&lt;/li&gt;
&lt;li&gt;🔄 Corrections and fine-tuning after generation&lt;/li&gt;
&lt;li&gt;📥 Export as PNG, SVG, or PDF&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How to use it:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://sci-draw.com" rel="noopener noreferrer"&gt;sci-draw.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Describe the desired figure in text (German works too)&lt;/li&gt;
&lt;li&gt;The AI generates the figure&lt;/li&gt;
&lt;li&gt;Add change requests if needed&lt;/li&gt;
&lt;li&gt;Download the finished figure&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Graphical Abstract
&lt;/h3&gt;

&lt;p&gt;When submitting to journals, a Graphical Abstract summarizing the research in a single figure is often required. With SciDraw AI, entering the abstract is enough to generate a suitable layout.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Experiment Workflow Diagram
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Example: "Please create a diagram of the experimental procedure
for gene cloning via PCR.
Steps: DNA extraction → Primer design → PCR amplification → 
Gel electrophoresis → Ligation → Transformation"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Concept and Mechanism Diagrams
&lt;/h3&gt;

&lt;p&gt;Complex biological mechanisms or engineering system concepts can likewise be generated from a text description.&lt;/p&gt;

&lt;h2&gt;
  
  
  Notes on Using AI Figures in Publications
&lt;/h2&gt;

&lt;p&gt;When using AI-generated figures in scientific work, keep the following in mind:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Check the journal's guidelines&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI usage policies of major journals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nature&lt;/strong&gt;: Use permitted if stated in the Methods (no authorship for AI)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Science&lt;/strong&gt;: Disclosure likewise required&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IEEE&lt;/strong&gt;: Recommends disclosing AI-assisted tools&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Verify scientific accuracy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI-generated figures must always be checked for correctness by the researchers themselves. The accuracy of structural formulas and numerical values remains a human responsibility.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Copyright and originality&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI-generated figures are generally considered original content, but please follow the respective journal guidelines.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Conventional&lt;/th&gt;
&lt;th&gt;AI generation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Creation time&lt;/td&gt;
&lt;td&gt;Hours to days&lt;/td&gt;
&lt;td&gt;Minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Design skills&lt;/td&gt;
&lt;td&gt;Required&lt;/td&gt;
&lt;td&gt;Not needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Quality consistency&lt;/td&gt;
&lt;td&gt;Person-dependent&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Revision effort&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;Instant via text instruction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;Illustrator from $22/mo / BioRender from $39/mo&lt;/td&gt;
&lt;td&gt;Free credits available&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Researchers' time should go into the &lt;strong&gt;research itself&lt;/strong&gt; – not into designing figures. Use AI tools to streamline your publication workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://sci-draw.com" rel="noopener noreferrer"&gt;SciDraw AI – AI tool for scientific figures&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://patentfig.ai" rel="noopener noreferrer"&gt;PatentFig AI – AI tool for patent drawings&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://datatopaper.com" rel="noopener noreferrer"&gt;Data2Paper – Automatic paper generation from data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>I Spent Two Hours Rotoscoping a Dance Video. Then an AI Did It in Two Minutes.</title>
      <dc:creator>local ai</dc:creator>
      <pubDate>Sun, 29 Mar 2026 04:52:08 +0000</pubDate>
      <link>https://dev.to/local_ai_28441e061d716cb1/i-spent-two-hours-rotoscoping-a-dance-video-then-an-ai-did-it-in-two-minutes-1imj</link>
      <guid>https://dev.to/local_ai_28441e061d716cb1/i-spent-two-hours-rotoscoping-a-dance-video-then-an-ai-did-it-in-two-minutes-1imj</guid>
      <description>&lt;h1&gt;
  
  
  I Spent Two Hours Rotoscoping a Dance Video. Then an AI Did It in Two Minutes.
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Fcover_en.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Fcover_en.png" alt="Cover" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Last Wednesday night, I had a simple task: extract a dancer from a video and put her on a clean background.&lt;/p&gt;

&lt;p&gt;Simple, right?&lt;/p&gt;

&lt;p&gt;I opened Premiere Pro. Fired up the Roto Brush. Two hours later, the hair was a smeared mess, the skirt edges looked like they'd been cut with safety scissors, and I was questioning my career choices.&lt;/p&gt;

&lt;p&gt;Then I tried an online matting tool. Uploaded the video, waited five minutes, and got back something that flickered like a strobe light — the extraction boundary jittered on every single frame.&lt;/p&gt;

&lt;p&gt;At 1 AM, frustrated and caffeinated, I stumbled on a GitHub repo called &lt;strong&gt;MatAnyone2&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Two minutes later, I had my jaw on the floor.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is MatAnyone2?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Fteaser.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Fteaser.jpg" alt="MatAnyone2 Results" width="800" height="381"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;MatAnyone2 is a &lt;strong&gt;video matting framework&lt;/strong&gt; developed by researchers at S-Lab (Nanyang Technological University) and SenseTime Research. It was just accepted to &lt;strong&gt;CVPR 2026&lt;/strong&gt; — the top conference in computer vision.&lt;/p&gt;

&lt;p&gt;What it does: takes a regular video — no green screen, no special lighting — and extracts people with &lt;strong&gt;pixel-perfect alpha mattes&lt;/strong&gt;. That means hair strands, translucent fabrics, wispy edges — all preserved with precise transparency values.&lt;/p&gt;

&lt;p&gt;This isn't binary segmentation (person = 1, background = 0). This is real matting. Every pixel gets a transparency value between 0 and 1. The difference matters enormously when you composite onto a new background.&lt;/p&gt;
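
&lt;p&gt;To see why continuous alpha matters, here is a minimal compositing sketch in plain NumPy (my own illustration, not MatAnyone2 code):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np

def composite(fg, bg, alpha):
    """Blend fg over bg. fg/bg: (H, W, 3) floats in [0, 1]; alpha: (H, W).

    With a binary mask, alpha is only ever 0 or 1, so edges look cut out.
    With a real matte, hair and fabric get fractional alpha and blend smoothly.
    """
    a = alpha[..., None]  # broadcast (H, W) to (H, W, 1)
    return a * fg + (1.0 - a) * bg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;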

&lt;h2&gt;
  
  
  How It Works (The Interesting Part)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Fmatanyone1vs2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Fmatanyone1vs2.jpg" alt="MatAnyone 1 vs 2 Comparison" width="800" height="287"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The core innovation is something called the &lt;strong&gt;Matting Quality Evaluator (MQE)&lt;/strong&gt; — essentially, the model has its own built-in quality inspector.&lt;/p&gt;

&lt;p&gt;Here's the problem it solves: traditional matting models train on synthetic data. You take a foreground, paste it on a background, and the model learns to undo that composition. But synthetic data is too clean. Real-world videos have wind-blown hair, changing lighting, motion blur, complex occlusions. Models trained purely on synthetic data choke on these.&lt;/p&gt;

&lt;p&gt;MatAnyone2's approach is clever. The MQE generates a pixel-level quality map for each matte — marking which regions are reliable and which are garbage. During training, the model only learns from the reliable pixels. Bad predictions get suppressed instead of reinforcing mistakes.&lt;/p&gt;
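
&lt;p&gt;My rough mental model of that training trick, as a few lines of PyTorch. To be clear, this is my own sketch of the idea from the paper's description, not the authors' actual loss code:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import torch

def masked_matting_loss(pred, target, quality_map, threshold=0.5):
    # quality_map in [0, 1]: the MQE's per-pixel reliability estimate.
    # Only pixels the evaluator trusts (at or above threshold) contribute
    # to the loss, so bad pseudo-labels are suppressed instead of
    # reinforcing mistakes.
    reliable = quality_map.ge(threshold).float()
    per_pixel = torch.abs(pred - target)  # simple L1 matting loss
    return (per_pixel * reliable).sum() / reliable.sum().clamp(min=1.0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;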

&lt;p&gt;Using this mechanism, the team built &lt;strong&gt;VMReal&lt;/strong&gt;: a dataset of &lt;strong&gt;28,000 real-world video clips and 2.4 million frames&lt;/strong&gt;, each annotated with quality evaluation maps. That's why it works so well on real footage — it was trained on real footage.&lt;/p&gt;

&lt;h2&gt;
  
  
  My First Run
&lt;/h2&gt;

&lt;p&gt;The workflow is dead simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Upload your video&lt;/li&gt;
&lt;li&gt;Click a few points on the first frame to mark your subject (SAM handles the mask generation)&lt;/li&gt;
&lt;li&gt;Hit "Video Matting"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Fteaser_demo.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Fteaser_demo.gif" alt="Interactive Demo" width="560" height="345"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On my RTX 3080, that dance video processed in about two minutes.&lt;/p&gt;

&lt;p&gt;I opened the alpha channel output and just stared at it. Individual hair strands. The gap between fingers. The semi-transparent edge of a flowing skirt. All clean. All temporally consistent — no flickering between frames.&lt;/p&gt;

&lt;p&gt;Those two hours I spent with Roto Brush suddenly felt very personal.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real Results
&lt;/h2&gt;

&lt;p&gt;Here are some test samples to give you a feel for the extraction quality:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Ftest-sample-0.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Ftest-sample-0.jpg" alt="Sample 1" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Ftest-sample-1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Ftest-sample-1.jpg" alt="Sample 2" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Ftest-sample-2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Ftest-sample-2.jpg" alt="Sample 3" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Look at the hair boundaries. Look at the semi-transparent regions. This isn't a hard cutout — it's a proper alpha matte with continuous transparency values. When you composite these onto a new background, there's no "sticker effect."&lt;/p&gt;

&lt;h2&gt;
  
  
  Multi-Person Support
&lt;/h2&gt;

&lt;p&gt;You can mark multiple people in the same video and extract them separately. For anyone doing VFX compositing, this is a game-changer.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Data Pipeline
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Fdata_pipeline.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fmatanyone2_1774754566%2Fdata_pipeline.jpg" alt="Data Pipeline" width="800" height="183"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What I find particularly elegant is how the MQE doubles as a data curator. Multiple matting models process the same video. The MQE evaluates each result, picks the best regions from each, and stitches them into a higher-quality composite annotation.&lt;/p&gt;

&lt;p&gt;This means annotation quality improves as more models and data are added. It's not a static tool — it's a system that gets better over time.&lt;/p&gt;
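
&lt;p&gt;A back-of-the-envelope version of that stitching step (again my own sketch, assuming per-pixel MQE scores exist for each candidate matte):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np

def stitch_best(mattes, qualities):
    """mattes, qualities: arrays of shape (M, H, W) for M candidate models.

    Pick, per pixel, the alpha value from whichever model the quality
    evaluator scored highest, yielding a composite annotation.
    """
    best = qualities.argmax(axis=0)  # winning model index per pixel
    return np.take_along_axis(mattes, best[None], axis=0)[0]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;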

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Hardware Requirements
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;NVIDIA GPU (8GB+ VRAM recommended)&lt;/li&gt;
&lt;li&gt;CUDA support&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Command Line (Fastest)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python inference_matanyone2.py &lt;span class="nt"&gt;-i&lt;/span&gt; your_video.mp4 &lt;span class="nt"&gt;-m&lt;/span&gt; your_mask.png &lt;span class="nt"&gt;-o&lt;/span&gt; results/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Feed it a video and a first-frame mask. Out comes a foreground video (green screen composite) and an alpha matte video.&lt;/p&gt;

&lt;h3&gt;
  
  
  Interactive GUI (Recommended for First-Timers)
&lt;/h3&gt;

&lt;p&gt;Launch the Gradio interface and everything is point-and-click. SAM is built in, so you don't need to prepare masks in advance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python API (For Integration)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;matanyone2&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MatAnyone2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;InferenceCore&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MatAnyone2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PeiqingYang/MatAnyone2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;processor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;InferenceCore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cuda:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;process_video&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;input_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your_video.mp4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;mask_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your_mask.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three lines. Drop it into your existing pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Compares
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Hair Detail&lt;/th&gt;
&lt;th&gt;Temporal Consistency&lt;/th&gt;
&lt;th&gt;Transparency&lt;/th&gt;
&lt;th&gt;Green Screen Required&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Premiere Roto Brush&lt;/td&gt;
&lt;td&gt;Manual labor&lt;/td&gt;
&lt;td&gt;Decent&lt;/td&gt;
&lt;td&gt;Poor&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Online Matting Tools&lt;/td&gt;
&lt;td&gt;Average&lt;/td&gt;
&lt;td&gt;Poor (flickers)&lt;/td&gt;
&lt;td&gt;Not supported&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Traditional Green Screen&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MatAnyone2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Excellent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Excellent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Excellent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;I've been doing video post-production long enough to be skeptical of anything that promises "one-click" results. Most of them look great in the demo reel and fall apart on real footage.&lt;/p&gt;

&lt;p&gt;MatAnyone2 is different. It's not approximate segmentation dressed up as matting. It's genuine pixel-level alpha estimation, trained on 2.4 million frames of real-world video, with a built-in quality evaluator that ensures the model only learns from its best work.&lt;/p&gt;

&lt;p&gt;If you do short-form content, film post-production, virtual streaming, or just want to swap the background on a home video — give this a try. It might change how you think about video extraction entirely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/pq-yang/MatAnyone2" rel="noopener noreferrer"&gt;https://github.com/pq-yang/MatAnyone2&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Live Demo&lt;/strong&gt;: &lt;a href="https://huggingface.co/spaces/PeiqingYang/MatAnyone2" rel="noopener noreferrer"&gt;https://huggingface.co/spaces/PeiqingYang/MatAnyone2&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One-Click Deploy Package&lt;/strong&gt;: &lt;a href="https://www.patreon.com/posts/154208684" rel="noopener noreferrer"&gt;https://www.patreon.com/posts/154208684&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Automating Clinical Data Analysis: The Pipeline From Hospital Exports to Paper Drafts</title>
      <dc:creator>local ai</dc:creator>
      <pubDate>Sat, 28 Mar 2026 09:31:16 +0000</pubDate>
      <link>https://dev.to/local_ai_28441e061d716cb1/automating-clinical-data-analysis-the-pipeline-from-hospital-exports-to-paper-drafts-phh</link>
      <guid>https://dev.to/local_ai_28441e061d716cb1/automating-clinical-data-analysis-the-pipeline-from-hospital-exports-to-paper-drafts-phh</guid>
      <description>&lt;h1&gt;
  
  
  Automating Clinical Data Analysis: The Pipeline From Hospital Exports to Paper Drafts
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6fhvxgsv5rewpc3rzo1m.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6fhvxgsv5rewpc3rzo1m.jpg" alt="Cover" width="800" height="422"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I've been building &lt;a href="https://datatopaper.com" rel="noopener noreferrer"&gt;Data2Paper&lt;/a&gt; — a tool that turns research data into complete paper drafts. The latest challenge: handling clinical datasets from hospital systems.&lt;/p&gt;

&lt;p&gt;If you've never worked with hospital data exports, here's what makes them... fun.&lt;/p&gt;

&lt;h2&gt;
  
  
  The input problem
&lt;/h2&gt;

&lt;p&gt;A typical clinical data export looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PatientID | Age | Sex | HbA1c | SBP | DBP | eGFR | Dx | AdmDate | DisDate | Status
001       | 67  | M   | 8.2   | 145 | 92  |      | T2DM | 2024-01-15 | 01/25/2024 | alive
002       | 54  | F   |       | 128 | 78  | 85   | 2型糖尿病 | 20240203 | 2024-02-10 | 
003       | -5  | M   | 7.1   | 300 | 85  | 92   | type 2 DM | 2024-03-01 | 2024-03-08 | dead
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice: three different date formats across the two date columns, the same diagnosis coded three different ways, an obviously impossible age, a systolic BP that's probably a data entry error, missing values that could mean "not tested" or "not recorded," and mixed languages.&lt;/p&gt;

&lt;p&gt;This is normal. Every clinical researcher I've talked to confirms: this is what the export looks like.&lt;/p&gt;
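
&lt;p&gt;As a concrete illustration of the cleaning step, here is roughly what it looks like in pandas for the export above. The synonym map and plausibility ranges are illustrative stand-ins, not a full clinical dictionary, and &lt;code&gt;format="mixed"&lt;/code&gt; needs pandas ≥ 2.0:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import pandas as pd

df = pd.DataFrame({
    "Age": [67, 54, -5],
    "SBP": [145, 128, 300],
    "Dx": ["T2DM", "2型糖尿病", "type 2 DM"],
    "AdmDate": ["2024-01-15", "20240203", "2024-03-01"],
    "DisDate": ["01/25/2024", "2024-02-10", "2024-03-08"],
})

# Normalize mixed date formats; unparseable values become NaT.
for col in ("AdmDate", "DisDate"):
    df[col] = pd.to_datetime(df[col], format="mixed", errors="coerce")

# Unify diagnosis coding across abbreviations and languages.
df["Dx"] = df["Dx"].replace({"T2DM": "type 2 DM", "2型糖尿病": "type 2 DM"})

# Flag (rather than silently drop) physiologically implausible values.
for col, (lo, hi) in {"Age": (0, 120), "SBP": (60, 260)}.items():
    df[f"{col}_suspect"] = ~df[col].between(lo, hi)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;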

&lt;h2&gt;
  
  
  The analysis pipeline
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw export (CSV/XLSX)
│
├─ Structure detection
│   └─ row = patient? visit? wide? long?
│
├─ Data cleaning
│   ├─ Date format standardization
│   ├─ Coding unification ("T2DM" = "2型糖尿病" = "type 2 DM")
│   ├─ Outlier flagging (SBP=300, Age=-5)
│   └─ Missing value classification (not tested vs not recorded)
│
├─ Variable typing
│   ├─ Continuous (age, HbA1c, eGFR)
│   ├─ Categorical (sex, diagnosis, comorbidities)
│   └─ Time-to-event (survival time + censoring status)
│
├─ Statistical analysis (Python execution)
│   ├─ Baseline table with per-variable test selection
│   ├─ Regression (logistic / Cox / linear / Poisson)
│   ├─ Survival analysis (KM + log-rank)
│   └─ Diagnostic evaluation (ROC + AUC)
│
└─ Output generation
    ├─ Formatted tables (baseline, regression results)
    ├─ Figures (KM curves, ROC curves, forest plots)
    └─ Manuscript sections (methods + results)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key technical decisions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Python execution, not LLM computation.&lt;/strong&gt; Statistics must be verifiable. The LLM writes the interpretation; &lt;code&gt;scipy&lt;/code&gt;, &lt;code&gt;statsmodels&lt;/code&gt;, and &lt;code&gt;lifelines&lt;/code&gt; compute the numbers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Clinical variable lookup.&lt;/strong&gt; Recognizing "SBP" as systolic blood pressure enables domain-aware outlier detection (flag 300 mmHg as likely error) rather than purely statistical outlier methods.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Assumption checking.&lt;/strong&gt; Every statistical test includes prerequisite verification — normality for parametric tests, events-per-variable for logistic regression, proportional hazards for Cox. Running analysis without assumption checks is the #1 reason clinical papers get sent back by reviewers.&lt;/p&gt;
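
&lt;p&gt;For a feel of what those checks look like in code, here is a sketch using lifelines' public API and its bundled demo dataset (not Data2Paper's actual internals):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

df = load_rossi()  # bundled demo data: 'week' duration, 'arrest' event

cph = CoxPHFitter()
cph.fit(df, duration_col="week", event_col="arrest")

# Proportional hazards check via Schoenfeld-residual-based tests.
cph.check_assumptions(df, p_value_threshold=0.05)

# Events-per-variable rule of thumb: roughly 10+ events per predictor
# before trusting a logistic/Cox model.
n_predictors = df.shape[1] - 2  # all columns except duration and event
print("EPV:", df["arrest"].sum() / n_predictors)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;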

&lt;h2&gt;
  
  
  The baseline table problem
&lt;/h2&gt;

&lt;p&gt;Generating Table 1 (baseline characteristics) sounds simple but requires per-variable logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;for variable in dataset:
    if is_categorical(variable):
        ...  # n (%), chi-square or Fisher's exact
    elif is_normal(variable):
        ...  # mean ± SD, t-test or ANOVA
    elif is_skewed(variable):
        ...  # median (IQR), Mann-Whitney or Kruskal-Wallis
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tricky part is automating the normality decision and handling the edge cases (small cell counts triggering Fisher's instead of chi-square, for instance).&lt;/p&gt;
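
&lt;p&gt;A minimal sketch of that decision logic with scipy. The below-5 expected-count rule is the textbook convention, &lt;code&gt;fisher_exact&lt;/code&gt; here assumes a 2×2 table, and the Shapiro-Wilk gate is one simple choice among several:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import numpy as np
from scipy import stats
from scipy.stats.contingency import expected_freq

def choose_categorical_test(table):
    # Classic rule: Fisher's exact when any expected cell count falls
    # below 5, otherwise chi-square. fisher_exact assumes a 2x2 table.
    if np.less(expected_freq(table), 5).any():
        return "fisher", stats.fisher_exact(table)
    return "chi2", stats.chi2_contingency(table)

def is_normal(x, alpha=0.05):
    # Shapiro-Wilk as a simple automated gate; for large samples a
    # skewness cutoff or a visual check is often more sensible.
    return bool(np.greater(stats.shapiro(x).pvalue, alpha))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;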

&lt;h2&gt;
  
  
  Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Next.js + Vercel&lt;/li&gt;
&lt;li&gt;Claude API for text generation&lt;/li&gt;
&lt;li&gt;Python chain for statistical computation&lt;/li&gt;
&lt;li&gt;Export: PDF / DOCX / LaTeX / ZIP&lt;/li&gt;
&lt;li&gt;7 output languages&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I'm still figuring out
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Better heuristics for distinguishing "not tested" vs "not recorded" missing values&lt;/li&gt;
&lt;li&gt;Automated detection of wide vs long format in longitudinal datasets&lt;/li&gt;
&lt;li&gt;Handling mixed-language clinical notes in the same dataset&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you've worked on similar problems — clinical data pipelines, automated statistical analysis, or structured document generation from data — I'd love to compare notes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://datatopaper.com" rel="noopener noreferrer"&gt;datatopaper.com&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>automation</category>
      <category>dataengineering</category>
      <category>datascience</category>
      <category>writing</category>
    </item>
    <item>
      <title>How to Create Medical and Science Book Illustrations With AI</title>
      <dc:creator>local ai</dc:creator>
      <pubDate>Sun, 22 Mar 2026 13:17:56 +0000</pubDate>
      <link>https://dev.to/local_ai_28441e061d716cb1/how-to-create-medical-and-science-book-illustrations-with-ai-10m</link>
      <guid>https://dev.to/local_ai_28441e061d716cb1/how-to-create-medical-and-science-book-illustrations-with-ai-10m</guid>
      <description>&lt;h1&gt;
  
  
  How to Create Medical and Science Book Illustrations With AI
&lt;/h1&gt;

&lt;p&gt;Medical and science publishing has a very specific illustration problem.&lt;/p&gt;

&lt;p&gt;You do not just need a figure that looks good. You need one that explains clearly, survives multiple review rounds, stays consistent across chapters, and can be reused in print pages, lecture slides, LMS modules, and translated editions.&lt;/p&gt;

&lt;p&gt;That is why AI is becoming useful in this space. Not because it replaces editorial judgment, but because it speeds up the first draft and makes figure production more scalable.&lt;/p&gt;

&lt;p&gt;In this article, I will walk through a practical workflow for creating medical book illustrations, science book figures, and textbook diagrams with AI, while keeping the output usable for real publishing work.&lt;/p&gt;

&lt;p&gt;If you want a tool built specifically for this workflow, visit &lt;a href="https://sci-draw.com" rel="noopener noreferrer"&gt;SciDraw&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fafmnqehksb8jcbnzh9vu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fafmnqehksb8jcbnzh9vu.png" alt="Cover image" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Original image link: &lt;a href="https://cdn.xueshu.fun/202603201935059.png" rel="noopener noreferrer"&gt;https://cdn.xueshu.fun/202603201935059.png&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Textbook Illustrations Need a Different Workflow
&lt;/h2&gt;

&lt;p&gt;A figure for a medical or science book has a higher bar than a generic marketing visual.&lt;/p&gt;

&lt;p&gt;It usually needs to satisfy five constraints at the same time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It must be consistent with other figures in the same book.&lt;/li&gt;
&lt;li&gt;It must be easy to edit after author, editor, or reviewer feedback.&lt;/li&gt;
&lt;li&gt;It must work across print, presentation, and digital teaching formats.&lt;/li&gt;
&lt;li&gt;It must support localization for future translated editions.&lt;/li&gt;
&lt;li&gt;It must prioritize scientific clarity over decoration.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This changes the goal completely.&lt;/p&gt;

&lt;p&gt;The goal is not to generate a beautiful one-off image. The goal is to build a figure system that is accurate, reusable, and inexpensive to revise.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy5zinoebkz8uprtqe8oa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy5zinoebkz8uprtqe8oa.png" alt="Book illustration workflow overview" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Original image link: &lt;a href="https://cdn.xueshu.fun/202603201938377.png" rel="noopener noreferrer"&gt;https://cdn.xueshu.fun/202603201938377.png&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Five Illustration Types That Appear Again and Again
&lt;/h2&gt;

&lt;p&gt;In most medical and science book projects, the same visual patterns keep coming back. Once you recognize them, prompting becomes much easier.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Mechanism Diagrams
&lt;/h3&gt;

&lt;p&gt;These explain how something works, such as immune pathways, signaling cascades, drug mechanisms, or physiological feedback loops.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Anatomy and Structure Figures
&lt;/h3&gt;

&lt;p&gt;These focus on labeled structures, including organs, tissue layers, anatomical landmarks, and system overviews.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Process and Workflow Figures
&lt;/h3&gt;

&lt;p&gt;These help readers follow a sequence, such as a diagnostic pathway, treatment algorithm, lab procedure, or experimental workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Comparison Figures
&lt;/h3&gt;

&lt;p&gt;These are useful when teaching differences, such as normal vs. diseased states, before vs. after treatment, or side-by-side techniques.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Chapter Summary Figures
&lt;/h3&gt;

&lt;p&gt;These compress an entire chapter into one visual and help readers retain the main logic, sequence, or takeaways.&lt;/p&gt;

&lt;p&gt;When you classify the figure correctly before prompting, the review cycle usually gets shorter and the result is much easier to refine.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Practical AI Workflow for Book Illustrations
&lt;/h2&gt;

&lt;p&gt;Here is the workflow that tends to work best for authors, editors, and educators.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Start With the Teaching Objective
&lt;/h3&gt;

&lt;p&gt;Before writing any prompt, define the job of the figure.&lt;/p&gt;

&lt;p&gt;Ask:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What should the reader understand after looking at it?&lt;/li&gt;
&lt;li&gt;Is this mainly a mechanism, a structure, a process, or a comparison?&lt;/li&gt;
&lt;li&gt;What absolutely needs to be labeled?&lt;/li&gt;
&lt;li&gt;What should be simplified or left out?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the teaching objective is vague, the figure usually becomes visually crowded no matter how polished it looks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Prompt From Structure, Not Style
&lt;/h3&gt;

&lt;p&gt;Strong textbook prompts start with content structure instead of decorative adjectives.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create a medical book illustration explaining type II hypersensitivity.
Use a horizontal educational layout with 3 numbered sections:
1. Antibody binding to cell-surface antigen
2. Effector activation (complement / Fc receptor mediated response)
3. Target cell damage

Use clean textbook styling, white background, blue-teal-red palette,
clear arrows, concise English labels, and publication-ready hierarchy.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works because it defines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the learning goal&lt;/li&gt;
&lt;li&gt;the layout&lt;/li&gt;
&lt;li&gt;the sequence&lt;/li&gt;
&lt;li&gt;the labeling logic&lt;/li&gt;
&lt;li&gt;the general visual direction&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 3: Generate the First Draft Quickly
&lt;/h3&gt;

&lt;p&gt;At this stage, speed matters more than perfection.&lt;/p&gt;

&lt;p&gt;The first draft only needs to answer four questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is the structure right?&lt;/li&gt;
&lt;li&gt;Are the labels in the right general positions?&lt;/li&gt;
&lt;li&gt;Does the flow make sense?&lt;/li&gt;
&lt;li&gt;Is the density appropriate for the chapter?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of the first output as editorial scaffolding, not final artwork.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Edit for Publishing Logic
&lt;/h3&gt;

&lt;p&gt;This is where the real quality comes from.&lt;/p&gt;

&lt;p&gt;Refine the draft for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;terminology&lt;/li&gt;
&lt;li&gt;label order&lt;/li&gt;
&lt;li&gt;arrow direction&lt;/li&gt;
&lt;li&gt;color meaning&lt;/li&gt;
&lt;li&gt;spacing&lt;/li&gt;
&lt;li&gt;caption compatibility&lt;/li&gt;
&lt;li&gt;visual hierarchy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI gets you to a strong draft faster. Editorial work makes it publishable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faoikiar6nryu312a73t0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faoikiar6nryu312a73t0.png" alt="Medical mechanism book illustration example" width="800" height="597"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Original image link: &lt;a href="https://cdn.xueshu.fun/202603201939133.png" rel="noopener noreferrer"&gt;https://cdn.xueshu.fun/202603201939133.png&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Reuse the Base Figure Across Formats
&lt;/h3&gt;

&lt;p&gt;This is where the time savings compound.&lt;/p&gt;

&lt;p&gt;A good book illustration should be reusable in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;print chapters&lt;/li&gt;
&lt;li&gt;lecture slides&lt;/li&gt;
&lt;li&gt;online teaching modules&lt;/li&gt;
&lt;li&gt;instructor guides&lt;/li&gt;
&lt;li&gt;translated editions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If every figure is built as a dead-end asset, the production cost stays high. If figures are built as reusable teaching components, the workflow becomes much more efficient.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prompt Templates You Can Use Immediately
&lt;/h2&gt;

&lt;p&gt;Here are a few prompt patterns that work well for common textbook illustration tasks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Medical Mechanism Figure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create a medical book illustration for [topic].
Target audience: [undergraduate / graduate / professional training].
Use a [horizontal / vertical] textbook layout with [number] sections.
Show [key actors] and [key events] in logical sequence.
Include concise English labels, arrows for causal flow, and a clean
white background. Use a professional educational style with strong
visual hierarchy and publication-ready clarity.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Anatomy Overview
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create an anatomy diagram for a medical textbook.
Topic: [organ / system / structure].
Show the major labeled regions only, not every fine detail.
Use a clean educational style, legible English labels, subtle color
coding, and a balanced layout suitable for print and lecture slides.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Comparison Figure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create a comparison illustration for a science book.
Compare [condition A] vs [condition B].
Use a two-column layout with matched scale, mirrored organization,
and clear difference callouts. Keep labels concise and make the
visual contrast obvious without clutter.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Workflow or Decision Pathway
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create a workflow figure for a medical or science textbook.
Topic: [diagnostic pathway / treatment algorithm / lab process].
Use numbered steps, directional arrows, short labels, and a clear
start-to-end reading path. Make it easy to reuse in both print and
presentation formats.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How to Keep a Whole Book Visually Consistent
&lt;/h2&gt;

&lt;p&gt;One of the biggest mistakes in book production is treating every figure as a separate art project.&lt;/p&gt;

&lt;p&gt;A better approach is to define a visual system at the beginning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one core color palette&lt;/li&gt;
&lt;li&gt;one label style&lt;/li&gt;
&lt;li&gt;one arrow style&lt;/li&gt;
&lt;li&gt;one spacing rule&lt;/li&gt;
&lt;li&gt;one callout pattern&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then reuse those rules in every prompt.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Use the same visual system as previous chapter figures:
white background, teal primary structures, orange emphasis,
dark gray labels, rounded panel boxes, thin directional arrows,
minimal shadows, publication-ready textbook style.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That single paragraph can save hours of revision over the course of a full book.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6qae26eo5vsmokr0wqqs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6qae26eo5vsmokr0wqqs.png" alt="Reuse across print, slides, and digital courseware" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Original image link: &lt;a href="https://cdn.xueshu.fun/202603201940704.png" rel="noopener noreferrer"&gt;https://cdn.xueshu.fun/202603201940704.png&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A Simple Quality Checklist Before Finalizing a Figure
&lt;/h2&gt;

&lt;p&gt;Before approving a figure for publication, check the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Are labels short enough to survive translation later?&lt;/li&gt;
&lt;li&gt;Is the figure still readable when reduced on a printed page?&lt;/li&gt;
&lt;li&gt;Can the same composition work in slides or LMS layouts?&lt;/li&gt;
&lt;li&gt;Are colors supporting explanation instead of acting as decoration?&lt;/li&gt;
&lt;li&gt;Does each panel communicate one clear teaching point?&lt;/li&gt;
&lt;li&gt;Can an editor or co-author revise it without rebuilding everything?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the answer is yes, the figure is doing real publishing work, not just visual work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Takeaway
&lt;/h2&gt;

&lt;p&gt;The most effective workflow for medical and science book illustrations is not "AI instead of editing."&lt;/p&gt;

&lt;p&gt;It is "AI for the first 80%, followed by a reusable editorial workflow for the last 20%."&lt;/p&gt;

&lt;p&gt;That approach gives authors and educators three concrete advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;faster figure production&lt;/li&gt;
&lt;li&gt;easier revision&lt;/li&gt;
&lt;li&gt;stronger visual consistency across the entire book&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your team is producing textbook diagrams at scale, the highest-leverage move is to build one reusable figure system and keep every new illustration inside that system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try SciDraw
&lt;/h2&gt;

&lt;p&gt;If you want to turn chapter outlines, rough sketches, and reference images into clean, reusable scientific illustrations, visit &lt;a href="https://sci-draw.com" rel="noopener noreferrer"&gt;SciDraw&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;SciDraw is built for scientific and medical visuals that need to work across books, slides, and digital courseware.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>I built an AI tool that turns survey data into research papers — here's the architecture</title>
      <dc:creator>local ai</dc:creator>
      <pubDate>Sun, 22 Mar 2026 08:56:46 +0000</pubDate>
      <link>https://dev.to/local_ai_28441e061d716cb1/i-built-an-ai-tool-that-turns-survey-data-into-research-papers-heres-the-architecture-4fha</link>
      <guid>https://dev.to/local_ai_28441e061d716cb1/i-built-an-ai-tool-that-turns-survey-data-into-research-papers-heres-the-architecture-4fha</guid>
      <description>

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu9cfpq6e98uogfmdby1m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu9cfpq6e98uogfmdby1m.png" alt="Data2Paper Cover" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Hey DEV community! I'm a solo founder building AI tools for researchers. My latest product is &lt;a href="https://datatopaper.com" rel="noopener noreferrer"&gt;Data2Paper&lt;/a&gt; — it takes raw survey/questionnaire export data and produces complete research paper drafts.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;Researchers collect survey data → export CSV → spend weeks turning it into a paper.&lt;/p&gt;

&lt;p&gt;The manual workflow looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clean the exported data (fix encoding, remove junk rows, identify the actual response sheet)&lt;/li&gt;
&lt;li&gt;Recode variables and set up analysis frameworks&lt;/li&gt;
&lt;li&gt;Run statistical tests in SPSS/R/Python&lt;/li&gt;
&lt;li&gt;Build tables and charts&lt;/li&gt;
&lt;li&gt;Write methodology, results, and discussion sections&lt;/li&gt;
&lt;li&gt;Format everything into a deliverable document&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Data2Paper compresses that entire workflow into a single pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture overview
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────┐
│  Upload      │  CSV / XLSX / XLS
│  (Survey     │  from any questionnaire platform
│   Export)    │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Data        │  Identify response sheet vs summary
│  Intake      │  Parse machine headers (Q1, SC2...)
│              │  Detect variable types
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Analysis    │  Python execution chain
│  Engine      │  Statistical tests based on variable types
│              │  Generate charts &amp;amp; tables
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Paper       │  Multi-language (7 languages)
│  Generation  │  Full academic structure
│              │  Claude API
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Export      │  PDF / Word / LaTeX / ZIP
│  &amp;amp; Delivery  │
└─────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key technical decisions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why Python execution instead of LLM-generated stats?
&lt;/h3&gt;

&lt;p&gt;Language models can hallucinate numbers. For a research tool, that's unacceptable. The analysis engine runs actual Python code to compute statistics — correlation, regression, chi-square, ANOVA, etc. The LLM interprets the results, but doesn't generate them.&lt;/p&gt;
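
&lt;p&gt;To make the split concrete, here's a minimal sketch of the pattern with scipy (the file and column names are hypothetical; the point is that the numbers come from code and the LLM only narrates them):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
from scipy import stats

df = pd.read_csv("survey_responses.csv")  # cleaned survey export (hypothetical)

# Computed by code, never by the LLM
r, p = stats.pearsonr(df["satisfaction"], df["tenure_years"])

# The LLM only receives the finished numbers to interpret in prose
summary_for_llm = (
    f"Pearson correlation between satisfaction and tenure: "
    f"r = {r:.3f}, p = {p:.4f}, n = {len(df)}."
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;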

&lt;h3&gt;
  
  
  Why survey-specific, not generic?
&lt;/h3&gt;

&lt;p&gt;Generic "data to text" tools don't understand that row 1 might be a machine header, that columns might represent Likert scales, or that the first sheet might be a summary rather than raw data. By focusing specifically on survey exports, the system handles these patterns reliably.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why multi-language from day one?
&lt;/h3&gt;

&lt;p&gt;Research is global. A tool that only outputs English misses a huge segment of users — Chinese grad students, European consulting teams, Japanese research groups. Supporting 7 languages natively in the generation pipeline (rather than translating English output afterward) was a deliberate product decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tech stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend/Backend:&lt;/strong&gt; Next.js on Vercel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI:&lt;/strong&gt; Claude API&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analysis:&lt;/strong&gt; Python execution chain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payments:&lt;/strong&gt; Stripe&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Export:&lt;/strong&gt; PDF, DOCX, LaTeX rendering&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;If you work with survey data or know someone in academia who does: &lt;a href="https://datatopaper.com" rel="noopener noreferrer"&gt;datatopaper.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'd love feedback from the DEV community, especially around the analysis pipeline design and the multi-language generation approach. Drop a comment or reach out!&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How to Check If a Scientific Figure Is Ready for Journal Submission</title>
      <dc:creator>local ai</dc:creator>
      <pubDate>Tue, 17 Mar 2026 06:38:34 +0000</pubDate>
      <link>https://dev.to/local_ai_28441e061d716cb1/how-to-check-if-a-scientific-figure-is-ready-for-journal-submission-4pj3</link>
      <guid>https://dev.to/local_ai_28441e061d716cb1/how-to-check-if-a-scientific-figure-is-ready-for-journal-submission-4pj3</guid>
      <description>

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;You're about to submit a paper. The manuscript is polished. Then the journal upload system starts asking about figure resolution, format, and dimensions.&lt;/p&gt;

&lt;p&gt;Sound familiar?&lt;/p&gt;

&lt;p&gt;Most figure rejections aren't about bad science — they're about bad file hygiene. The figure &lt;em&gt;looks&lt;/em&gt; fine on your 4K monitor, but at final print width, it's blurry. Or the JPEG compression has been quietly eating your axis labels. Or your red-vs-green comparison chart is invisible to 8% of male readers.&lt;/p&gt;

&lt;p&gt;Here are the four checks every figure needs before you hit "Upload."&lt;/p&gt;




&lt;h2&gt;
  
  
  Check 1: Effective DPI ≠ File DPI
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The trap:&lt;/strong&gt; You exported at 300 DPI. You're safe, right?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The reality:&lt;/strong&gt; DPI metadata means nothing without knowing the final placement width.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Image: 2400 × 1600 pixels
Exported at: 300 DPI

At single-column (85 mm / 3.35"):
  → Effective DPI = 2400 ÷ 3.35 = 716 DPI ✅

At double-column (180 mm / 7.09"):
  → Effective DPI = 2400 ÷ 7.09 = 338 DPI ✅ (barely)

At full-page (210 mm / 8.27"):
  → Effective DPI = 2400 ÷ 8.27 = 290 DPI ⚠️ (below threshold)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; Always check DPI against the &lt;em&gt;actual column width&lt;/em&gt; your figure will occupy.&lt;/p&gt;
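
&lt;p&gt;The check itself is one division per layout. A quick Python sketch using Pillow (the filename is yours; the widths match the layouts above):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from PIL import Image

TARGET_WIDTHS_IN = {"single-column": 3.35, "double-column": 7.09, "full-page": 8.27}

width_px, _ = Image.open("figure3.png").size  # pixel width is what matters

for layout, inches in TARGET_WIDTHS_IN.items():
    effective_dpi = width_px / inches
    verdict = "OK" if effective_dpi &gt;= 300 else "below threshold"
    print(f"{layout}: {effective_dpi:.0f} DPI ({verdict})")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;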

&lt;h2&gt;
  
  
  Check 2: File Format Matters
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Your figure has...&lt;/th&gt;
&lt;th&gt;Use&lt;/th&gt;
&lt;th&gt;Avoid&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Text, labels, arrows, line art&lt;/td&gt;
&lt;td&gt;TIFF, PDF, EPS, SVG&lt;/td&gt;
&lt;td&gt;JPEG&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Photographs, microscopy&lt;/td&gt;
&lt;td&gt;TIFF, high-quality JPEG&lt;/td&gt;
&lt;td&gt;Low-quality JPEG&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mixed content&lt;/td&gt;
&lt;td&gt;TIFF, PDF&lt;/td&gt;
&lt;td&gt;JPEG&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt; JPEG compression creates artifacts around sharp edges. Every re-save makes it worse. If your figure has &lt;em&gt;any&lt;/em&gt; text or line work, JPEG is risky.&lt;/p&gt;

&lt;p&gt;Also watch out for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unexpected transparency/alpha channels (some journals can't handle them)&lt;/li&gt;
&lt;li&gt;RGB vs. CMYK color mode mismatches&lt;/li&gt;
&lt;li&gt;Files that have been re-exported multiple times (quality degrades cumulatively)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Check 3: Grayscale Readability
&lt;/h2&gt;

&lt;p&gt;Many reviewers print papers in black and white. If your figure relies entirely on color to convey information, it may become unreadable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Common failures:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Two data series with different colors → same gray value&lt;/li&gt;
&lt;li&gt;Heatmap gradients → flat gray blob&lt;/li&gt;
&lt;li&gt;Colored annotations → invisible against background&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Quick test:&lt;/strong&gt; Open your figure in any image editor, convert to grayscale, and check if every element is still distinguishable.&lt;/p&gt;
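
&lt;p&gt;If you'd rather script the test, Pillow does the conversion in one line (assuming your figure is figure3.png):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from PIL import Image

# "L" mode is luminance only: what a black-and-white printout preserves
Image.open("figure3.png").convert("L").save("figure3_grayscale_check.png")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;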

&lt;h2&gt;
  
  
  Check 4: Colorblind Safety
&lt;/h2&gt;

&lt;p&gt;Color vision deficiency affects ~8% of males and ~0.5% of females. The most common type makes red and green look nearly identical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;High-risk patterns:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Red vs. green for different conditions&lt;/li&gt;
&lt;li&gt;Multiple saturated hues without pattern/shape backup&lt;/li&gt;
&lt;li&gt;Color as the &lt;em&gt;only&lt;/em&gt; way to distinguish data series&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Use colorblind-safe palettes, add markers or line style variations, and include direct labels where possible.&lt;/p&gt;
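
&lt;p&gt;One concrete version of that fix in matplotlib: a colorblind-safe palette (Okabe-Ito) plus marker and line-style redundancy, so no series depends on hue alone. A sketch with made-up data:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import matplotlib.pyplot as plt
import numpy as np

# Okabe-Ito palette: distinguishable under common color vision deficiencies
OKABE_ITO = ["#0072B2", "#D55E00", "#009E73"]
MARKERS = ["o", "s", "^"]
LINESTYLES = ["-", "--", ":"]

x = np.linspace(0, 10, 20)
for i, label in enumerate(["control", "treatment A", "treatment B"]):
    plt.plot(x, (i + 1) * np.sqrt(x), color=OKABE_ITO[i],
             marker=MARKERS[i], linestyle=LINESTYLES[i], label=label)

plt.legend()
plt.savefig("comparison_colorblind_safe.png", dpi=300)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;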




&lt;h2&gt;
  
  
  Preflight Workflow
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1 → Use the actual file you'll submit (not a draft)
Step 2 → Set the target layout width
Step 3 → Run all four checks
Step 4 → Keep / Re-export / Redraw
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Quick Decision Guide
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;th&gt;What to do&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;✅ All clear&lt;/td&gt;
&lt;td&gt;Submit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⚠️ Format or DPI warning&lt;/td&gt;
&lt;td&gt;Re-export with better settings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;❌ Grayscale or colorblind fail&lt;/td&gt;
&lt;td&gt;Adjust colors, add labels/patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;❌ Resolution too low&lt;/td&gt;
&lt;td&gt;Re-render at higher resolution or use vector format&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Submission Checklist
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Figure checked at actual final column width&lt;/li&gt;
&lt;li&gt;[ ] Effective DPI ≥ 300 at that width&lt;/li&gt;
&lt;li&gt;[ ] Format is safe for text and line work&lt;/li&gt;
&lt;li&gt;[ ] Readable in grayscale&lt;/li&gt;
&lt;li&gt;[ ] Key distinctions pass colorblind simulation&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://sci-draw.com/figure-checker" rel="noopener noreferrer"&gt;&lt;strong&gt;SciDraw Figure Checker&lt;/strong&gt;&lt;/a&gt; runs all four checks automatically. Upload a figure, set your target width, and get a preflight report.&lt;/p&gt;

&lt;p&gt;Other useful tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔄 &lt;a href="https://sci-draw.com/convert" rel="noopener noreferrer"&gt;SciDraw Converter&lt;/a&gt; — Convert between TIFF, EPS, PDF with DPI/CMYK control&lt;/li&gt;
&lt;li&gt;🎨 &lt;a href="https://sci-draw.com/ai-drawing" rel="noopener noreferrer"&gt;SciDraw AI Drawing&lt;/a&gt; — Generate scientific illustrations from text descriptions&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;What's your worst figure submission horror story? Drop it in the comments 👇&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>I Ran LTX 2.3 Locally — Image to Video with Audio, No Cloud Required</title>
      <dc:creator>local ai</dc:creator>
      <pubDate>Sun, 08 Mar 2026 11:34:23 +0000</pubDate>
      <link>https://dev.to/local_ai_28441e061d716cb1/i-ran-ltx-23-locally-image-to-video-with-audio-no-cloud-required-30f1</link>
      <guid>https://dev.to/local_ai_28441e061d716cb1/i-ran-ltx-23-locally-image-to-video-with-audio-no-cloud-required-30f1</guid>
      <description>&lt;h1&gt;
  
  
  I Ran LTX 2.3 Locally — Image to Video with Audio, No Cloud Required
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdtjtbhlwxgq9od9pujnm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdtjtbhlwxgq9od9pujnm.png" alt="Cover" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Last Wednesday night, I got my 12th "content policy violation" of the month.&lt;/p&gt;

&lt;p&gt;I wasn't doing anything illegal. Just a portrait photo, a simple motion prompt. The kind of thing any filmmaker would shoot on set.&lt;/p&gt;

&lt;p&gt;The platform didn't care. The error message was the same cold boilerplate it always is.&lt;/p&gt;

&lt;p&gt;That was the moment I decided I was done with cloud video generation.&lt;/p&gt;




&lt;p&gt;Two hours later, someone dropped a link in a Discord server I'm in.&lt;/p&gt;

&lt;p&gt;"LTX 2.3 GGUF is out. Runs on consumer GPUs. Image-to-video with native audio."&lt;/p&gt;

&lt;p&gt;I stared at that message for a few seconds.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Native audio.&lt;/em&gt; Not dubbed afterward. Not a separate step. Generated alongside the video, synchronized, as one output.&lt;/p&gt;

&lt;p&gt;I closed the browser tab with the content violation error and started downloading the model.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is LTX 2.3?
&lt;/h2&gt;

&lt;p&gt;LTX-Video is an open-source video generation model from Lightricks, an Israeli company that's been in the media processing space for a while. Version 2.3 is their most capable release yet, and what makes it genuinely interesting compared to everything else out there:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It generates video and audio simultaneously.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not video first, then audio layered on top. The model jointly produces both streams — synchronized dialogue, ambient sound, environmental audio — as a single generation pass. That's architecturally different from most pipelines where audio is an afterthought.&lt;/p&gt;

&lt;p&gt;Other notable upgrades in 2.3:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Redesigned VAE for sharper fine details (hair, fabric texture, edges)&lt;/li&gt;
&lt;li&gt;Significantly improved image-to-video quality&lt;/li&gt;
&lt;li&gt;4K resolution support at up to 50 FPS&lt;/li&gt;
&lt;li&gt;Better prompt understanding and camera motion control&lt;/li&gt;
&lt;li&gt;Portrait (9:16) support alongside landscape&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The base model sits at 19 billion parameters. Running it at full precision would require 38GB+ VRAM — firmly in server territory.&lt;/p&gt;

&lt;p&gt;Then GGUF happened.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why GGUF Changes Everything
&lt;/h2&gt;

&lt;p&gt;The short version: GGUF is a quantization format that compresses model weights from 16-bit floats down to 4-bit (or lower). Same model, roughly a quarter of the size.&lt;/p&gt;

&lt;p&gt;The version I'm using is &lt;code&gt;Q4_K_S&lt;/code&gt; — about 10.7GB for the main model file. My GPU is an RTX 3080 with 10GB VRAM. The text encoder (Gemma 3 12B) offloads to CPU/RAM. Main model runs on GPU.&lt;/p&gt;
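
&lt;p&gt;The sizes are back-of-envelope checkable if you assume Q4_K_S averages roughly 4.5 bits per weight once the quantization scales are counted:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BF16:   19e9 params × 16 bits ÷ 8   ≈ 38 GB
Q4_K_S: 19e9 params × ~4.5 bits ÷ 8 ≈ 10.7 GB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;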

&lt;p&gt;&lt;strong&gt;Result: a 5-second, 960×544 video with audio in about 2-3 minutes.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Is that fast? No. Is it running entirely on my own hardware, with no cloud, no API calls, no usage logs? Yes.&lt;/p&gt;

&lt;p&gt;That trade-off is completely worth it to me.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Output Actually Looks Like
&lt;/h2&gt;

&lt;p&gt;I ran an image-to-video test with a portrait photo. The prompt was minimal — I wanted to see what the model would do with almost no direction.&lt;/p&gt;

&lt;p&gt;Input image:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fltx23_1772945993%2Finput_image.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.xueshu.fun%2Farticles%2Fltx23_1772945993%2Finput_image.png" alt="Input image" width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;First output:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;[embedded video]&lt;/em&gt;&lt;/p&gt;


&lt;p&gt;Second test with a different input:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;[embedded video]&lt;/em&gt;&lt;/p&gt;


&lt;p&gt;Honest assessment: it's not perfect. At Q4 quantization you lose some sharpness compared to the full BF16 model. Motion can be slightly jerky on complex scenes.&lt;/p&gt;

&lt;p&gt;But the audio synchronization is genuinely impressive. And more importantly — &lt;strong&gt;this ran on my desk, with no data leaving my machine.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Privacy Argument (And Why It Actually Matters)
&lt;/h2&gt;

&lt;p&gt;Let me be direct about something most AI tool reviews dance around.&lt;/p&gt;

&lt;p&gt;Every image you upload to a cloud video generation service is stored on someone else's server. Every prompt you type is logged. Every generation becomes part of your usage profile. The terms of service you clicked through without reading probably give them broad rights to that data.&lt;/p&gt;

&lt;p&gt;I'm not being paranoid. This is just how SaaS works.&lt;/p&gt;

&lt;p&gt;Local inference changes the equation completely. The model lives on your hard drive. Inference runs on your GPU. The output files go to your output folder.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The entire pipeline is air-gapped from the internet.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No usage logs. No content moderation API calls. No third party with visibility into what you're creating.&lt;/p&gt;

&lt;p&gt;If you're working on creative projects that might not survive a content policy review — not because they're harmful, but because algorithms are bad at context — this matters.&lt;/p&gt;

&lt;p&gt;What you create is between you and your hardware.&lt;/p&gt;




&lt;h2&gt;
  
  
  Hardware Requirements
&lt;/h2&gt;

&lt;p&gt;Here's what you actually need:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Minimum&lt;/th&gt;
&lt;th&gt;Recommended&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GPU&lt;/td&gt;
&lt;td&gt;RTX 3080 10GB&lt;/td&gt;
&lt;td&gt;RTX 4080 16GB+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAM&lt;/td&gt;
&lt;td&gt;32GB (text encoder on CPU)&lt;/td&gt;
&lt;td&gt;64GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage&lt;/td&gt;
&lt;td&gt;30GB free&lt;/td&gt;
&lt;td&gt;50GB+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OS&lt;/td&gt;
&lt;td&gt;Windows 10/11&lt;/td&gt;
&lt;td&gt;Windows 11&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Model files you need:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Main model: &lt;code&gt;LTX-2.3-distilled-Q4_K_S.gguf&lt;/code&gt; (~10.7GB)&lt;/li&gt;
&lt;li&gt;Text encoder: Gemma 3 12B fp4 + LTX text projection layer&lt;/li&gt;
&lt;li&gt;Video VAE: &lt;code&gt;LTX23_video_vae_bf16.safetensors&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Audio VAE: &lt;code&gt;LTX23_audio_vae_bf16.safetensors&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;LoRA: &lt;code&gt;LTX-2-Image2Vid-Adapter.safetensors&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your VRAM is under 12GB, the text encoder (Gemma 3 12B) will run on CPU. You'll need 32GB of system RAM for that to work without swapping to disk.&lt;/p&gt;




&lt;h2&gt;
  
  
  One-Click Setup
&lt;/h2&gt;

&lt;p&gt;I've packaged a complete pre-configured environment that includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full ComfyUI installation with all required custom nodes pre-installed&lt;/li&gt;
&lt;li&gt;All model files (no separate downloads needed)&lt;/li&gt;
&lt;li&gt;A Gradio web interface — just open a browser, upload an image, write a prompt, hit generate&lt;/li&gt;
&lt;li&gt;Pre-tuned workflow matching the settings that produced the videos above&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Double-click &lt;code&gt;01-run.bat&lt;/code&gt;. Browser opens. Generate.&lt;/p&gt;

&lt;p&gt;No Python environment setup. No node installation. No YAML configuration. It just works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Download: &lt;a href="https://www.patreon.com/localai" rel="noopener noreferrer"&gt;https://www.patreon.com/localai&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  A Note on What This Enables
&lt;/h2&gt;

&lt;p&gt;I've been running local AI models for a few years now. What's changed recently isn't the existence of local models — it's the capability gap closing.&lt;/p&gt;

&lt;p&gt;Twelve months ago, local video generation was a curiosity. The outputs were bad enough that cloud services, despite their restrictions, were clearly better.&lt;/p&gt;

&lt;p&gt;That's no longer true.&lt;/p&gt;

&lt;p&gt;LTX 2.3 at Q4 quantization produces outputs that are competitive with mid-tier cloud services. And it does something cloud services can't do by design: it generates audio and video together, in a single pass, with no content filtering, on hardware you own.&lt;/p&gt;

&lt;p&gt;That's a meaningful shift.&lt;/p&gt;

&lt;p&gt;The technology for completely private, unrestricted, high-quality video generation now fits on a consumer GPU. What people do with that capability — the creative projects they pursue, the content they make — is genuinely up to them.&lt;/p&gt;

&lt;p&gt;That's new.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Download the one-click package: &lt;a href="https://www.patreon.com/posts/ltx-2-3-locally-152521808" rel="noopener noreferrer"&gt;https://www.patreon.com/posts/ltx-2-3-locally-152521808&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Running questions? Drop a comment. I respond to most of them.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>deeplearning</category>
      <category>machinelearning</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Applying Constrained Generative AI to USPTO Design Patent Figure Production</title>
      <dc:creator>local ai</dc:creator>
      <pubDate>Sat, 07 Mar 2026 12:29:11 +0000</pubDate>
      <link>https://dev.to/local_ai_28441e061d716cb1/applying-constrained-generative-ai-to-uspto-design-patent-figure-production-30ck</link>
      <guid>https://dev.to/local_ai_28441e061d716cb1/applying-constrained-generative-ai-to-uspto-design-patent-figure-production-30ck</guid>
      <description>&lt;p&gt;Design patent drawings present an underexplored constrained generation problem: produce technically compliant multi-view technical illustrations from 3D inputs, where compliance is formally specified and verifiable.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Task Definition
&lt;/h3&gt;

&lt;p&gt;USPTO design patent applications conventionally require the full set of &lt;strong&gt;7 orthographic and perspective views&lt;/strong&gt; of the claimed design. Compliance constraints include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Line weight consistency&lt;/strong&gt; across all views (rejections triggered by inter-view variance)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Surface shading conventions&lt;/strong&gt; derived from historical technical illustration standards (contour shading, stippling for transparency, oblique strokes for metallic surfaces)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Broken line semantics&lt;/strong&gt;: dashed lines indicate &lt;em&gt;unclaimed&lt;/em&gt; subject matter; solid lines indicate &lt;em&gt;claimed&lt;/em&gt; subject matter. The boundary must be unambiguous across all views simultaneously.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Completeness&lt;/strong&gt;: every geometric feature visible in any view must be consistently disclosed in all views where it would be visible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a structured prediction problem with a formal evaluation criterion (USPTO examiner acceptance / 112 rejection rate).&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Is Non-Trivial
&lt;/h3&gt;

&lt;p&gt;Standard image-to-image or 3D rendering pipelines don't solve this directly. Challenges include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cross-view consistency&lt;/strong&gt; is a global constraint—not a per-image property. Each view must be checked against all others.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shading semantics&lt;/strong&gt; are perceptual, not photorealistic. Patent shading communicates &lt;em&gt;surface type&lt;/em&gt; to a human examiner, not lighting simulation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Broken line placement&lt;/strong&gt; is a legal/strategic decision, not a visual one. The model must support human-in-the-loop control over claim boundary designation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain gap&lt;/strong&gt;: training data for USPTO-compliant line art is limited compared to general CAD or technical illustration datasets.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  PatentFig: Applied Approach
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://patentfig.ai" rel="noopener noreferrer"&gt;&lt;strong&gt;PatentFig&lt;/strong&gt;&lt;/a&gt; is a production system we built to address this pipeline. It accepts 3D models, CAD screenshots, or sketches as input and generates USPTO-compliant multi-view figures.&lt;/p&gt;

&lt;p&gt;Key design decisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Separate the &lt;strong&gt;geometric projection&lt;/strong&gt; step (deterministic, rule-based) from the &lt;strong&gt;stylization&lt;/strong&gt; step (learned); see the sketch after this list&lt;/li&gt;
&lt;li&gt;Human-controlled broken-line toggle rather than attempting to infer claim strategy from geometry&lt;/li&gt;
&lt;li&gt;Output validated against known rejection categories before delivery&lt;/li&gt;
&lt;/ul&gt;
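
&lt;p&gt;The projection half of that first decision is deterministic linear algebra, which is what makes it independently testable. A toy numpy sketch (illustrative geometry, not the production pipeline):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np

# Toy 3D vertices of the claimed design (columns: x, y, z)
vertices = np.array([[0, 0, 0], [1, 0, 0], [1, 2, 0], [0, 2, 3]], dtype=float)

# Orthographic projections: drop one axis per canonical view
VIEWS = {
    "front": (0, 2),  # keep x, z
    "top":   (0, 1),  # keep x, y
    "side":  (1, 2),  # keep y, z
}

projections = {name: vertices[:, list(axes)] for name, axes in VIEWS.items()}
# `projections` feeds the learned stylization stage; this step stays rule-based.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;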

&lt;p&gt;The system currently handles the full 7-view generation workflow and is live at &lt;strong&gt;&lt;a href="https://patentfig.ai" rel="noopener noreferrer"&gt;patentfig.ai&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Open Questions
&lt;/h3&gt;

&lt;p&gt;Genuinely curious whether anyone in this community has worked on related problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-view consistency as a training objective (beyond just single-image quality)&lt;/li&gt;
&lt;li&gt;Domain adaptation for technical illustration styles with small training sets&lt;/li&gt;
&lt;li&gt;Formal verification of structured visual output against rule-based compliance specs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Happy to discuss technical tradeoffs or share more about the architecture. Comments open.&lt;/p&gt;

&lt;p&gt;→ &lt;strong&gt;&lt;a href="https://patentfig.ai" rel="noopener noreferrer"&gt;patentfig.ai&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




</description>
      <category>ai</category>
      <category>automation</category>
      <category>design</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>InfiniteTalk: I Gave a Portrait a Voice. It Took One Audio File and Zero Cloud Services.</title>
      <dc:creator>local ai</dc:creator>
      <pubDate>Sat, 21 Feb 2026 03:35:34 +0000</pubDate>
      <link>https://dev.to/local_ai_28441e061d716cb1/infinitetalk-i-gave-a-portrait-a-voice-it-took-one-audio-file-and-zero-cloud-services-2g8</link>
      <guid>https://dev.to/local_ai_28441e061d716cb1/infinitetalk-i-gave-a-portrait-a-voice-it-took-one-audio-file-and-zero-cloud-services-2g8</guid>
      <description>&lt;h1&gt;
  
  
  InfiniteTalk: I Gave a Portrait a Voice. It Took One Audio File and Zero Cloud Services.
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9njwxoruf632q7n8iyxk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9njwxoruf632q7n8iyxk.png" alt="Cover" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Last month, a client asked me to create a product demo video with a real human presenter.&lt;/p&gt;

&lt;p&gt;Outsourcing quote: $1,100.&lt;/p&gt;

&lt;p&gt;What I actually spent: three days and electricity.&lt;/p&gt;

&lt;p&gt;Here's how.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem With Every "AI Avatar" Tool I've Tried
&lt;/h2&gt;

&lt;p&gt;I've tested most of the major players. HeyGen. D-ID. Synthesia. Runway.&lt;/p&gt;

&lt;p&gt;They work. But they come with baggage:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They're expensive.&lt;/strong&gt; You get a few minutes of generation time and then you're paying again. Fine for one-offs. Terrible for any kind of volume.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They log everything.&lt;/strong&gt; Every portrait you upload, every script you type—it lives on their servers. I found this out the uncomfortable way when a roleplay scenario I was working on got flagged by their content moderation. Nothing illegal. Just "not within acceptable use."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The output feels dead.&lt;/strong&gt; The mouth moves. Everything else doesn't. No head micro-movements. No blinking. No natural shoulder motion. It looks like a talking photograph, not a person.&lt;/p&gt;

&lt;p&gt;I needed something local.&lt;/p&gt;




&lt;h2&gt;
  
  
  Found on GitHub at 1 AM
&lt;/h2&gt;

&lt;p&gt;I was scrolling through GitHub trending when I found &lt;strong&gt;InfiniteTalk&lt;/strong&gt; by MeiGen-AI.&lt;/p&gt;

&lt;p&gt;Three lines in the README made me stop:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Unlimited-length talking video generation"&lt;br&gt;
"lip sync + head movements + body posture + facial expressions"&lt;br&gt;
"runs locally on consumer hardware"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The model is built on Wan2.1—the same model family that's been quietly dominating the open-source video generation space.&lt;/p&gt;

&lt;p&gt;I cloned the repo.&lt;/p&gt;




&lt;h2&gt;
  
  
  The First Result Stopped Me Cold
&lt;/h2&gt;

&lt;p&gt;One portrait. One audio clip. Thirty seconds of generation time.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;[embedded video]&lt;/em&gt;&lt;/p&gt;


&lt;p&gt;The lips moved. I expected that.&lt;/p&gt;

&lt;p&gt;What I didn't expect: &lt;strong&gt;the head tilted slightly. The eyes blinked. The shoulders had that subtle rise-and-fall you get when someone's actually speaking.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not mechanical bobbing. Not a canned animation loop. The kind of micro-movement that happens when a person's body is actually responding to what they're saying.&lt;/p&gt;

&lt;p&gt;I generated it again with different audio. Same natural quality.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Works When Others Don't
&lt;/h2&gt;

&lt;p&gt;Traditional lip-sync tools—SadTalker, MuseTalk, most of what you'll find on GitHub—share a fundamental approach: &lt;strong&gt;they only touch the mouth.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Take a video, isolate the mouth region, replace it with audio-driven mouth movement, leave everything else alone.&lt;/p&gt;

&lt;p&gt;The problem is obvious once you say it out loud: when a real person talks, nothing is stationary. The head nods. The brow moves. The shoulders track breathing.&lt;/p&gt;

&lt;p&gt;Fix only the mouth and you get an uncanny valley effect that's hard to articulate but immediately obvious.&lt;/p&gt;

&lt;p&gt;InfiniteTalk takes a different approach. It doesn't patch a video. &lt;strong&gt;It generates a new one.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Input: portrait + audio.&lt;br&gt;
Output: a video synthesized from scratch, where audio drives not just the lips but the entire body's motion pattern.&lt;/p&gt;

&lt;p&gt;The benchmark numbers back this up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;InfiniteTalk lip error: &lt;strong&gt;1.8mm&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;MuseTalk: 2.7mm&lt;/li&gt;
&lt;li&gt;SadTalker: 3.2mm&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That 0.9mm gap between InfiniteTalk and MuseTalk is the difference between "convincing" and "almost convincing."&lt;/p&gt;




&lt;h2&gt;
  
  
  What "Unlimited Length" Actually Means
&lt;/h2&gt;

&lt;p&gt;Default generation is 81 frames—about 3 seconds at 25fps.&lt;/p&gt;

&lt;p&gt;But 3 seconds isn't a ceiling. It's a unit.&lt;/p&gt;

&lt;p&gt;InfiniteTalk uses a &lt;strong&gt;sparse-frame context window&lt;/strong&gt;: after each chunk generates, it passes the final frames forward as reference material for the next chunk. The result is seamless continuity—same identity, same background stability, same audio-lip alignment—across arbitrarily long videos.&lt;/p&gt;
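
&lt;p&gt;In pseudocode, the chaining loop looks roughly like this (generate_chunk stands in for the actual model call; the context size is illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CHUNK_FRAMES = 81     # ~3 seconds at 25 fps
CONTEXT_FRAMES = 8    # trailing frames carried into the next chunk (illustrative)

def generate_long_video(portrait, audio_chunks, generate_chunk):
    """Sketch of sparse-frame chaining: each chunk is conditioned on the last
    frames of the previous one, so identity and background stay stable."""
    video, context = [], None
    for audio in audio_chunks:
        chunk = generate_chunk(portrait, audio, context=context,
                               num_frames=CHUNK_FRAMES)
        video.extend(chunk)
        context = chunk[-CONTEXT_FRAMES:]  # reference frames for continuity
    return video
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;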

&lt;p&gt;I tested a 3-minute clip. No identity drift. No background flicker. Lip sync held throughout.&lt;/p&gt;

&lt;p&gt;Here's a second example:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;[embedded video]&lt;/em&gt;&lt;/p&gt;





&lt;h2&gt;
  
  
  Hardware Requirements
&lt;/h2&gt;

&lt;p&gt;You don't need a top-tier GPU.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;480p&lt;/strong&gt;: 6GB VRAM minimum&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;720p&lt;/strong&gt;: 16GB+ recommended&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'm running an RTX 3090. A 3-second 480p clip takes 30-60 seconds to generate. Not instant, but perfectly workable for the quality you get.&lt;/p&gt;

&lt;p&gt;Models you'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Wan2.1_I2V_14B_FusionX-Q4_0.gguf&lt;/code&gt; (quantized main model, VRAM-friendly)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;wan2.1_infiniteTalk_single_fp16.safetensors&lt;/code&gt; (InfiniteTalk patch)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;wav2vec2-chinese-base_fp16.safetensors&lt;/code&gt; (audio encoder)&lt;/li&gt;
&lt;li&gt;Supporting VAE, CLIP, LoRA weights&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All available on Hugging Face or regional mirrors.&lt;/p&gt;




&lt;h2&gt;
  
  
  One-Click Setup, No Code Required
&lt;/h2&gt;

&lt;p&gt;We wrapped the ComfyUI workflow in a &lt;strong&gt;Gradio web interface&lt;/strong&gt; for easier use.&lt;/p&gt;

&lt;p&gt;Launch: double-click &lt;code&gt;01-run.bat&lt;/code&gt;. Browser opens to &lt;code&gt;http://localhost:7860&lt;/code&gt; automatically.&lt;/p&gt;

&lt;p&gt;Left panel inputs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Portrait image (any format)&lt;/li&gt;
&lt;li&gt;Audio file (WAV or MP3)&lt;/li&gt;
&lt;li&gt;Text prompt (affects motion style, not content)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Right panel: generated MP4, ready to play and download.&lt;/p&gt;

&lt;p&gt;Advanced settings let you adjust resolution (256–1024px), frame count, and sampling steps. Defaults work fine for most use cases.&lt;/p&gt;
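
&lt;p&gt;For reference, the wrapper pattern is only a few lines of Gradio. A stripped-down sketch (run_workflow stands in for the actual ComfyUI invocation):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import gradio as gr

def run_workflow(portrait, audio, prompt):
    """Stand-in for the ComfyUI pipeline; returns a path to the generated MP4."""
    ...

demo = gr.Interface(
    fn=run_workflow,
    inputs=[gr.Image(type="filepath", label="Portrait"),
            gr.Audio(type="filepath", label="Audio (WAV/MP3)"),
            gr.Textbox(label="Prompt (motion style)")],
    outputs=gr.Video(label="Generated video"),
)
demo.launch(server_port=7860)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;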




&lt;h2&gt;
  
  
  The Part You're Probably Thinking About
&lt;/h2&gt;

&lt;p&gt;This runs entirely on local hardware.&lt;/p&gt;

&lt;p&gt;No cloud processing. No usage logs. No content moderation system watching what you generate.&lt;/p&gt;

&lt;p&gt;What portrait you use, what audio you provide, what you create with it—&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your hardware. Your call.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I'll leave the implications of that to your imagination.&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;The client got their video. They asked which production company I'd used.&lt;/p&gt;

&lt;p&gt;I told them I'd generated it at home, on my own machine.&lt;/p&gt;

&lt;p&gt;Two seconds of silence.&lt;/p&gt;

&lt;p&gt;"Can you do the second episode too?"&lt;/p&gt;

&lt;p&gt;Yes.&lt;/p&gt;

&lt;p&gt;One-click download: &lt;strong&gt;&lt;a href="https://www.patreon.com/posts/151286461" rel="noopener noreferrer"&gt;https://www.patreon.com/posts/151286461&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>showdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>I tried automating patent figure creation — here's what actually worked</title>
      <dc:creator>local ai</dc:creator>
      <pubDate>Fri, 20 Feb 2026 13:46:28 +0000</pubDate>
      <link>https://dev.to/local_ai_28441e061d716cb1/i-tried-automating-patent-figure-creation-heres-what-actually-worked-1o0p</link>
      <guid>https://dev.to/local_ai_28441e061d716cb1/i-tried-automating-patent-figure-creation-heres-what-actually-worked-1o0p</guid>
      <description>&lt;p&gt;If you've ever had to prepare a patent application, you know the figures are a pain. Not intellectually hard, just tedious and surprisingly expensive if you're outsourcing them.&lt;/p&gt;

&lt;p&gt;I spent some time testing AI tools in this space and want to share what I found for anyone else going through the same process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem with most "AI drawing" tools for patents&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most general image generators produce things that look cool but are useless for patent filing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They use shading and color (not allowed)&lt;/li&gt;
&lt;li&gt;No reference numerals&lt;/li&gt;
&lt;li&gt;Non-standard line weights&lt;/li&gt;
&lt;li&gt;No export at required DPI&lt;/li&gt;
&lt;li&gt;Can't generate consistent multi-view sets&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I was looking specifically for tools that understand patent drawing requirements, not just "technical illustration."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I tested&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After trying a few options, the one that actually addressed the workflow properly was &lt;a href="https://patentfig.ai" rel="noopener noreferrer"&gt;PatentFig&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The flow is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Upload a reference image (product photo, sketch, CAD screenshot) &lt;em&gt;or&lt;/em&gt; just describe the invention in text&lt;/li&gt;
&lt;li&gt;Choose which views you need (front, side, top, perspective, cross-section, flowchart...)&lt;/li&gt;
&lt;li&gt;Generate — it produces line art with reference numerals and leader lines&lt;/li&gt;
&lt;li&gt;Modify via chat ("move numeral 3 to the left", "add a detail view of this section")&lt;/li&gt;
&lt;li&gt;Export as PDF/PNG/SVG/TIFF at filing specs&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;What worked well&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Flowcharts and block diagrams: excellent. Software patent figures came out clean and filing-ready&lt;/li&gt;
&lt;li&gt;Simple product line art from photos: surprisingly good for consumer goods&lt;/li&gt;
&lt;li&gt;Chat-to-modify: this is the feature that makes iterating actually practical&lt;/li&gt;
&lt;li&gt;Multi-jurisdiction export: switching between USPTO and CNIPA formatting is a dropdown&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What was harder&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complex mechanical assemblies with many moving parts: needed more iteration rounds&lt;/li&gt;
&lt;li&gt;Reference numeral placement on crowded figures: sometimes needed manual adjustment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing context&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Free tier gives 20 credits/month (enough to test the workflow). Paid starts at $40/month billed annually. For a patent agent filing regularly, the time savings vs. outsourcing figures probably pays for itself within the first application.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bottom line&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're a solo inventor, startup founder, or patent agent handling your own figures, this is worth a test on your next application. The free tier is genuinely usable for evaluation.&lt;/p&gt;

&lt;p&gt;Drop a comment if you've tried other approaches — curious what workflows others have landed on.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How I Built an AI Tool for Scientists and Grew to 10K Users as a Solo Founder</title>
      <dc:creator>local ai</dc:creator>
      <pubDate>Mon, 02 Feb 2026 11:46:30 +0000</pubDate>
      <link>https://dev.to/local_ai_28441e061d716cb1/how-i-built-an-ai-tool-for-scientists-and-grew-to-10k-users-as-a-solo-founder-k6b</link>
      <guid>https://dev.to/local_ai_28441e061d716cb1/how-i-built-an-ai-tool-for-scientists-and-grew-to-10k-users-as-a-solo-founder-k6b</guid>
      <description>&lt;p&gt;Hey dev community! 👋&lt;/p&gt;

&lt;p&gt;I want to share my journey building a niche AI SaaS product as a solo founder. Hopefully, some insights here will be helpful for others on a similar path.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem I Noticed
&lt;/h2&gt;

&lt;p&gt;While working with researchers, I noticed a recurring pain point: &lt;strong&gt;scientists spend hours creating figures for their papers&lt;/strong&gt;. Most use PowerPoint or struggle with Illustrator, even though they're not designers. The results often look unprofessional, and the process is frustrating.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;I built &lt;a href="https://sci-draw.com" rel="noopener noreferrer"&gt;SciDraw&lt;/a&gt; - an AI-powered platform specifically for scientific illustrations. Researchers can describe what they need or upload a rough sketch, and the AI generates publication-ready figures.&lt;/p&gt;

&lt;p&gt;Key features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Text-to-image generation for scientific concepts&lt;/li&gt;
&lt;li&gt;Sketch refinement (turn rough drawings into polished diagrams)&lt;/li&gt;
&lt;li&gt;Editable SVG export (so users can fine-tune in any vector editor)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;p&gt;For those curious about the technical side:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Next.js + React&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backend&lt;/strong&gt;: Node.js&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI&lt;/strong&gt;: Custom pipeline combining multiple image generation models&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SVG Processing&lt;/strong&gt;: Custom algorithms to ensure clean, editable vector output&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What Worked for Growth
&lt;/h2&gt;

&lt;p&gt;I didn't have a marketing budget, so I focused on:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;SEO&lt;/strong&gt; - Targeting long-tail keywords like "AI scientific figure generator"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Directory submissions&lt;/strong&gt; - Listed on 80+ platforms (Product Hunt, G2, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Niche focus&lt;/strong&gt; - Instead of competing with general AI image tools, I went deep into one vertical&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Current Stats
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;10,000+ researchers using the platform&lt;/li&gt;
&lt;li&gt;~$2K MRR&lt;/li&gt;
&lt;li&gt;Users from Stanford, MIT, Harvard, and universities worldwide&lt;/li&gt;
&lt;li&gt;Fully bootstrapped, no funding&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Niche down aggressively&lt;/strong&gt; - A small pond is easier to dominate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solve a real pain point&lt;/strong&gt; - Scientists were literally wasting hours; the value prop was obvious&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Talk to users&lt;/strong&gt; - Early feedback shaped the product significantly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SEO compounds&lt;/strong&gt; - It's slow at first, but worth the investment&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;Currently working on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More diagram types (molecular structures, experimental workflows)&lt;/li&gt;
&lt;li&gt;Collaboration features for research teams&lt;/li&gt;
&lt;li&gt;API for integration with other tools&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Happy to answer any questions about building for the academic market or AI image generation! &lt;/p&gt;

&lt;p&gt;What niche are you building for? 👇&lt;/p&gt;

</description>
      <category>startup</category>
      <category>indiehacker</category>
      <category>ai</category>
      <category>saas</category>
    </item>
  </channel>
</rss>
