<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: András Iványi</title>
    <description>The latest articles on DEV Community by András Iványi (@andyskw).</description>
    <link>https://dev.to/andyskw</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3994933%2F18a421d5-0d9d-4e3c-95c0-ddb81324c2b7.jpg</url>
      <title>DEV Community: András Iványi</title>
      <link>https://dev.to/andyskw</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/andyskw"/>
    <language>en</language>
    <item>
      <title>I AI-remastered a 25-year-old game intro to real 1080p — and learned that the source matters more than the model</title>
      <dc:creator>András Iványi</dc:creator>
      <pubDate>Sun, 21 Jun 2026 05:52:12 +0000</pubDate>
      <link>https://dev.to/andyskw/i-ai-remastered-a-25-year-old-game-intro-to-real-1080p-and-learned-that-the-source-matters-more-gfn</link>
      <guid>https://dev.to/andyskw/i-ai-remastered-a-25-year-old-game-intro-to-real-1080p-and-learned-that-the-source-matters-more-gfn</guid>
      <description>&lt;p&gt;I spent way too long remastering the intro of &lt;em&gt;Imperium Galactica 2 – Solarian&lt;/em&gt;, a space-strategy game from 2000, to a clean 1080p using AI. Along the way I learned a pile of things the hard way — about SeedVR2, about temporal flicker, about running diffusion on a tiny AMD iGPU, and about how little the "big model" actually matters. This is the whole journey, dead ends included, so you don't have to repeat my mistakes.&lt;/p&gt;

&lt;p&gt;Everything (code + the generic pipeline) is here: &lt;strong&gt;&lt;a href="https://github.com/andyskw/ig2-solarian-seedvr2-remaster" rel="noopener noreferrer"&gt;https://github.com/andyskw/ig2-solarian-seedvr2-remaster&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;▶ &lt;strong&gt;Watch the full remaster:&lt;/strong&gt; &lt;a href="https://youtu.be/zn15PEU9nGY" rel="noopener noreferrer"&gt;English&lt;/a&gt; · &lt;a href="https://www.youtube.com/watch?v=gDFm5QmpbvQ" rel="noopener noreferrer"&gt;Hungarian dub&lt;/a&gt; · &lt;a href="https://www.youtube.com/watch?v=7zh0KB110sE" rel="noopener noreferrer"&gt;side-by-side 50/50 comparison&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://youtu.be/zn15PEU9nGY" rel="noopener noreferrer"&gt;https://youtu.be/zn15PEU9nGY&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F5agpzyjc3t45jca47bs7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F5agpzyjc3t45jca47bs7.png" alt="Original vs remaster, ship detail" width="800" height="276"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Lesson 0: the source matters more than the upscaler
&lt;/h2&gt;

&lt;p&gt;I had two copies of the same intro: a 360p one (the language dub I wanted) and a 1080p one (a different dub). At a glance the 1080p copy "looked the same" as the 360p one, but it isn't. A quick sharpness measurement (Laplacian variance) plus zoomed crops showed it carries genuinely more detail, and it has 8.7× the bitrate.&lt;/p&gt;
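&lt;p&gt;For readers who haven't met the metric: Laplacian variance is the variance of a second-derivative filter's response, so more fine detail gives a larger number. The sketch below is a stdlib-only illustration on tiny synthetic frames, not the script used for this project; on real frames you would normally run the same idea through an image library.&lt;/p&gt;

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    `gray` is a list of equal-length rows of grayscale values.
    Higher values mean more high-frequency detail (a sharper image).
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    return pvariance(responses)


# Synthetic stand-ins for frames: a flat grey one and a hard-edged checkerboard.
flat = [[128] * 6 for _ in range(6)]
checker = [[255 * ((x + y) % 2) for x in range(6)] for y in range(6)]

print(laplacian_variance(flat))
print(laplacian_variance(checker) > laplacian_variance(flat))
```

&lt;p&gt;Compare the same frame from both sources at the same resolution, otherwise the numbers are not comparable.&lt;/p&gt;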

&lt;p&gt;So the single biggest quality jump came from &lt;strong&gt;throwing away my preferred 360p source, upscaling the better 1080p one, and muxing the audio I wanted back at the very end.&lt;/strong&gt; Picture and audio are separable.&lt;br&gt;
If you take one thing from this post: &lt;strong&gt;feed your upscaler the best source you can find.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Getting AI to run on a tiny AMD iGPU
&lt;/h2&gt;

&lt;p&gt;My always-on box has an AMD &lt;strong&gt;Radeon 890M iGPU (gfx1150)&lt;/strong&gt; — no ROCm userspace installed, just the kernel driver. It still works, with two non-obvious tricks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# self-contained ROCm wheel — no system ROCm needed&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--index-url&lt;/span&gt; https://rocm.nightlies.amd.com/v2/gfx1151/ torch torchaudio torchvision
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;HSA_OVERRIDE_GFX_VERSION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;11.5.1   &lt;span class="c"&gt;# 11.0.0 -&amp;gt; hipErrorInvalidImage; unset -&amp;gt; GPU invisible&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;MIOPEN_FIND_MODE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2                 &lt;span class="c"&gt;# else MIOpen hangs FOREVER on the first VAE conv&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That MIOpen hang cost me ~27 minutes of staring at a GPU pinned at 100% with zero progress before I figured out the FAST find-mode. If you run SeedVR2 (or anything MIOpen-heavy) on a Strix-class iGPU, remember those two lines. :D&lt;/p&gt;

&lt;h2&gt;
  
  
  SeedVR2, and why the model size barely matters
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/numz/ComfyUI-SeedVR2_VideoUpscaler" rel="noopener noreferrer"&gt;SeedVR2&lt;/a&gt; gave the most natural result (real surface texture, temporal awareness). I compared 3B vs 7B and... they took almost the same time. Why?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fp16 VAE decode is the bottleneck&lt;/strong&gt;, not the diffusion transformer. Quantizing or shrinking the DiT changes runtime by a rounding error. The 3B-FP8 output looked more natural to me, so that's what I used.&lt;/p&gt;

&lt;h2&gt;
  
  
  The flicker hunt: it's &lt;code&gt;latent_noise&lt;/code&gt;, not &lt;code&gt;batch_size&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;The intro contains a lot of space battle scenes.&lt;/p&gt;

&lt;p&gt;Small, fast-moving objects (a little fighter, drifting shadows) flickered, because the model re-invents their detail every frame. (Watching carefully, the original scenes have a similar flicker too; upscaling only made it more visible.) The intuitive fix is a bigger temporal batch... and it helps, then &lt;strong&gt;plateaus&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The thing that actually worked was &lt;strong&gt;&lt;code&gt;latent_noise_scale&lt;/code&gt;&lt;/strong&gt;: low on faces (to keep detail), higher on fast motion (to suppress the per-frame re-hallucination). Things that did &lt;strong&gt;not&lt;/strong&gt; work: a temporal deflicker filter (not motion-compensated) and RIFE frame interpolation (smooth, but the flicker is baked into the generated frames, and 60 fps made me slightly motion-sick).&lt;/p&gt;
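&lt;p&gt;One way to keep that "low on faces, higher on fast motion" rule in one place is a small lookup table. This is only a sketch: the three values are the per-content starting points quoted further down in this post, while the class labels, the shot ids and the fallback value are hypothetical.&lt;/p&gt;

```python
# Starting points quoted in this post; the class names are hypothetical labels.
NOISE_BY_CONTENT = {
    "face": 0.05,
    "fast_action": 0.15,
    "tiny_fast_object": 0.20,
}


def latent_noise_for(shots, default=0.10):
    """Map each (shot_id, content_class) pair to a latent-noise value.

    Unknown classes fall back to `default`, an assumed middle value
    rather than a number taken from the post.
    """
    return {shot_id: NOISE_BY_CONTENT.get(content, default)
            for shot_id, content in shots}


# Hypothetical shot list, tagged by hand while reviewing the footage.
shots = [
    ("shot_001", "face"),
    ("shot_002", "tiny_fast_object"),
    ("shot_003", "landscape"),
]
plan = latent_noise_for(shots)
for shot_id, noise in plan.items():
    print(shot_id, noise)
```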

&lt;p&gt;Here's the worst offender — the little fighter in the opening space battle (&lt;strong&gt;~0:58&lt;/strong&gt; in the video).&lt;br&gt;
Left = original, right = remaster:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F0vht0a3iwsi63gynsjes.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F0vht0a3iwsi63gynsjes.gif" alt="Flickering ship: original vs remaster" width="719" height="208"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Oh, and on low/shared-VRAM GPUs a big batch will OOM the VAE decode — &lt;code&gt;--vae_decode_tiled&lt;/code&gt; fixes that.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scene-by-scene, with a human in the loop
&lt;/h2&gt;

&lt;p&gt;For the best result you process &lt;strong&gt;per shot&lt;/strong&gt;: split at every cut, run one batch per shot, and tune &lt;code&gt;latent_noise&lt;/code&gt; per shot by content (faces ~0.05, fast action ~0.15, tiny fast objects ~0.20). But getting the cut list was the real fight:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ffmpeg's scene filter missed cuts between visually similar space shots and over-segmented explosions. A "72-second shot" was actually ~5 shots.&lt;/li&gt;
&lt;li&gt;PySceneDetect's AdaptiveDetector, calibrated against a few hand-marked cuts, did much better.&lt;/li&gt;
&lt;li&gt;The opening "cut" was a &lt;strong&gt;1-second cross-dissolve&lt;/strong&gt; — invisible to content detectors; only my eye caught it.&lt;/li&gt;
&lt;/ul&gt;
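&lt;p&gt;Calibrating a detector against hand-marked cuts boils down to counting hits, misses and false cuts for each setting. A minimal sketch of that scoring step, assuming cuts are stored as frame numbers; the function name, the tolerance and the sample frame numbers are all made up for illustration:&lt;/p&gt;

```python
def score_cuts(detected, hand_marked, tolerance=2):
    """Compare detected cut frames with hand-marked ones.

    A detected cut is a hit if it lies within `tolerance` frames of a
    hand-marked cut that has not been matched yet.
    Returns (hits, missed, false_cuts).
    """
    remaining = sorted(hand_marked)
    hits, false_cuts = [], []
    for frame in sorted(detected):
        match = next((m for m in remaining if tolerance >= abs(m - frame)), None)
        if match is None:
            false_cuts.append(frame)
        else:
            hits.append(frame)
            remaining.remove(match)
    return hits, remaining, false_cuts


# Hypothetical frame numbers, for illustration only.
hand_marked = [120, 480, 910]
detected = [121, 300, 479]
hits, missed, false_cuts = score_cuts(detected, hand_marked)
print("hits:", hits)
print("missed:", missed)
print("false cuts:", false_cuts)
```

&lt;p&gt;Run it once per detector setting and keep the setting with the fewest misses; false cuts are cheaper, because they can be merged by hand afterwards.&lt;/p&gt;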

&lt;p&gt;The workflow that worked: auto-detect → render a &lt;strong&gt;contact sheet&lt;/strong&gt; (one thumbnail per shot) → verify and hand-edit. Treat automatic scene detection as a "draft".&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fx7afqdsgkbanuwfupzn6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fx7afqdsgkbanuwfupzn6.png" alt="Shot contact sheet for verification" width="800" height="508"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The bulk of the hand-editing was &lt;em&gt;merging&lt;/em&gt; false cuts. The rule I settled on: it's the same shot if the camera/subject continues across the "cut" (an explosion flash, fast motion, a fighter crossing the frame); it's only a new shot on a real change of framing, or on a dissolve, which detectors miss entirely.&lt;/p&gt;
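&lt;p&gt;Once the false cuts are marked, the merge itself is mechanical. A rough sketch, assuming shots are stored as (start, end) frame pairs with an exclusive end; the data layout and the frame numbers are assumptions, not the repo's actual format:&lt;/p&gt;

```python
def merge_false_cuts(shots, false_cut_frames):
    """Merge adjacent shots whose shared boundary was judged a false cut.

    `shots` is a sorted list of (start_frame, end_frame) pairs with an
    exclusive end, so shot N ends where shot N+1 starts.
    `false_cut_frames` holds the boundary frames a human marked as
    "the same shot continues here".
    """
    false_cuts = set(false_cut_frames)
    merged = []
    for start, end in shots:
        if merged and start in false_cuts and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end)  # extend the previous shot
        else:
            merged.append((start, end))
    return merged


# Hypothetical detector output: an explosion split one shot into three.
shots = [(0, 100), (100, 130), (130, 400), (400, 450)]
merged = merge_false_cuts(shots, [100, 130])
print(merged)
```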

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F14pygu1j3b0j7f4votfw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F14pygu1j3b0j7f4votfw.png" alt="Which shots I merged by hand, and why" width="800" height="320"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One more trap: &lt;strong&gt;split frame-exactly&lt;/strong&gt;. Time-based &lt;code&gt;-ss/-to&lt;/code&gt; added ~1 frame per cut, which drifted ~2 seconds over the video and broke lip-sync. The &lt;code&gt;trim&lt;/code&gt; filter with &lt;code&gt;start_frame&lt;/code&gt; and &lt;code&gt;end_frame&lt;/code&gt; gives the exact source frame count. Getting this right up front saves tons of time, and if you catch it only after the whole video is rendered, you will be really pissed off. :)&lt;/p&gt;
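&lt;p&gt;To make the frame-exact idea concrete, here is a small sketch that builds one ffmpeg &lt;code&gt;trim&lt;/code&gt; filter string per shot and shows how a one-frame error per cut adds up. The shot list, the cut count and the frame rate are illustrative assumptions; check the real values with ffprobe.&lt;/p&gt;

```python
def trim_filter(start_frame, end_frame):
    """ffmpeg video filter keeping frames start_frame .. end_frame - 1.

    Uses the trim filter's start_frame / end_frame options (end_frame is
    the first frame dropped), then resets timestamps so every shot
    starts at zero.
    """
    return (f"trim=start_frame={start_frame}:end_frame={end_frame},"
            "setpts=PTS-STARTPTS")


def drift_seconds(cuts, extra_frames_per_cut, fps):
    """Accumulated timing error if every cut gains a few frames."""
    return cuts * extra_frames_per_cut / fps


# Hypothetical shot list; the end of one shot is the start of the next.
shots = [(0, 400), (400, 450)]
filters = [trim_filter(start, end) for start, end in shots]
total_frames = sum(end - start for start, end in shots)

print(filters[0])
print(total_frames)  # equals the source frame count, so audio stays aligned
# Assumed numbers: 50 cuts at 25 fps, one extra frame each.
print(drift_seconds(cuts=50, extra_frames_per_cut=1, fps=25))
```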

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fh66rmn7nlzewhlcxf8zp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fh66rmn7nlzewhlcxf8zp.png" alt="Original vs remaster, face" width="800" height="225"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  By the numbers: a 5-second proving ground
&lt;/h2&gt;

&lt;p&gt;Here's the thing — I didn't discover any of the above on the full 3.5-minute intro. &lt;strong&gt;Every decision was made on a single 5-second clip&lt;/strong&gt;, looped through the pipeline over and over on the iGPU. Source choice, 3B vs 7B, fp8 vs Q4 vs fp16, batch 5 → 25 → 49 → 73, tiling on/off, &lt;code&gt;latent_noise&lt;/code&gt; 0 → 0.1 → 0.15, plus the deflicker and RIFE dead ends — roughly &lt;strong&gt;a dozen full SeedVR2 runs&lt;/strong&gt; on that one clip (two of which OOM'd and had to be killed), adding up to &lt;strong&gt;~16 hours of iGPU compute just to dial in the recipe&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Out of all that came &lt;strong&gt;~15 side-by-side comparison reels&lt;/strong&gt; — and the genuinely fun part was watching the same five seconds run 2, 3, even 4 ways at once:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F2iquu6rdlvowgolbb1pc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F2iquu6rdlvowgolbb1pc.png" alt="A few of the comparison reels" width="800" height="498"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Iterating on a short clip first is the whole trick: cheap enough to be patient, long enough to judge temporal behavior. Only once those 5 seconds looked right did I commit a single minute to the full render.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hardware reality: iGPU days vs cloud hours
&lt;/h2&gt;

&lt;p&gt;On the 890M iGPU the full run extrapolated to &lt;strong&gt;~74 hours&lt;/strong&gt; (~40 s/frame). So I rented an &lt;strong&gt;RTX PRO 6000 (Blackwell, 96 GB)&lt;/strong&gt; on vast.ai for ~$1.1/hour:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Blackwell (sm_120) needs &lt;strong&gt;CUDA 12.8&lt;/strong&gt; torch (PyTorch 2.12 / cuDNN 9.2).&lt;/li&gt;
&lt;li&gt;Encode dropped from ~12 &lt;strong&gt;minutes&lt;/strong&gt;/batch to ~&lt;strong&gt;28 seconds&lt;/strong&gt;/batch. With 96 GB: no tiling, whole-shot
batches.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The whole 3.5-minute intro: 2h 21m, about $2.70.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
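&lt;p&gt;A quick sanity check on those numbers, using only the figures quoted in this section. They are all rounded, so treat the derived values as rough:&lt;/p&gt;

```python
IGPU_HOURS = 74            # extrapolated full run on the iGPU
IGPU_SEC_PER_FRAME = 40    # approximate
CLOUD_MINUTES = 2 * 60 + 21
CLOUD_COST_USD = 2.70      # "about", as quoted above

frames = IGPU_HOURS * 3600 / IGPU_SEC_PER_FRAME
cloud_hours = CLOUD_MINUTES / 60
speedup = IGPU_HOURS / cloud_hours
implied_rate = CLOUD_COST_USD / cloud_hours

print(round(frames))           # 6660 frames implied by 74 h at 40 s/frame
print(round(cloud_hours, 2))   # 2.35
print(round(speedup, 1))       # 31.5
print(round(implied_rate, 2))  # 1.15 USD/hour, close to the quoted ~1.1
```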

&lt;p&gt;A few cloud gotchas that ate my time: &lt;code&gt;pkill -f inference_cli.py&lt;/code&gt; matched its own command line and killed my shell; moving "just the home folder" to a new box left the Python deps behind; and the rented box's stale CA bundle made &lt;code&gt;curl&lt;/code&gt; reject valid certs (&lt;code&gt;certificate has expired&lt;/code&gt;).&lt;/p&gt;
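&lt;p&gt;On the &lt;code&gt;pkill -f&lt;/code&gt; trap: the pattern is a regular expression matched against each process's full command line, so a wrapper shell that carries the pattern in its own arguments matches too. One common workaround is the bracket trick, where the literal text of the pattern no longer matches itself. The sketch below only simulates the matching with Python's &lt;code&gt;re&lt;/code&gt; module, and the command lines are hypothetical examples:&lt;/p&gt;

```python
import re


def would_match(pattern, command_line):
    """Approximate `pkill -f`: a regex search over the full command line."""
    return re.search(pattern, command_line) is not None


# Hypothetical command lines as they might look in the process table.
worker = "python inference_cli.py input.mp4"
naive_shell = "bash -c pkill -f inference_cli.py"
safe_shell = "bash -c pkill -f [i]nference_cli.py"

print(would_match("inference_cli.py", worker))          # the job we want to stop
print(would_match("inference_cli.py", naive_shell))     # the wrapper shell matches too
print(would_match("[i]nference_cli.py", worker))        # still finds the job
print(would_match("[i]nference_cli.py", safe_shell))    # no longer matches the wrapper
```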

&lt;p&gt;Here's what that full render buys you on a wide shot — the same 50/50 split, remaster on the left:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ftnflthc8tr5u4js8ho5w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Ftnflthc8tr5u4js8ho5w.png" alt="Remaster vs original, landscape (50/50 split)" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Prefer to scrub through it yourself? Here's the full 50/50 split, start to finish:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=7zh0KB110sE" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=7zh0KB110sE&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Finishing + a YouTube tip
&lt;/h2&gt;

&lt;p&gt;Mux the audio onto the silent master (&lt;code&gt;-c:v copy&lt;/code&gt;); the frame-exact split keeps it in sync. And when you upload, &lt;strong&gt;upscale to 4K first.&lt;/strong&gt; YouTube assigns bitrate by resolution tier, so a 4K upload preserves your 1080p content far better through its re-encode than a native 1080p upload.&lt;/p&gt;
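&lt;p&gt;The mux step can be written as an argument list for ffmpeg. This is a sketch with hypothetical file names, not the exact command from the repo; &lt;code&gt;-c:a copy&lt;/code&gt; assumes the audio codec is allowed in the output container.&lt;/p&gt;

```python
def mux_command(silent_video, audio, output):
    """Build an ffmpeg argv that copies the video and adds one audio track."""
    return [
        "ffmpeg",
        "-i", silent_video,
        "-i", audio,
        "-map", "0:v:0",   # video from the first input
        "-map", "1:a:0",   # audio from the second input
        "-c:v", "copy",    # keep the rendered frames untouched
        "-c:a", "copy",    # assumes the codec fits the container
        output,
    ]


# Hypothetical file names.
cmd = mux_command("master_silent.mp4", "dub.m4a", "final.mp4")
print(" ".join(cmd))
```

&lt;p&gt;Pass the list to &lt;code&gt;subprocess.run(cmd, check=True)&lt;/code&gt; to execute it; a list avoids shell-quoting problems with odd file names.&lt;/p&gt;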

&lt;h2&gt;
  
  
  The recipe (use the repo)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Pick the best-quality source; keep audio separate.&lt;/li&gt;
&lt;li&gt;Detect shots → &lt;strong&gt;verify by hand&lt;/strong&gt; (contact sheet); watch for dissolves.&lt;/li&gt;
&lt;li&gt;Frame-exact split → per-shot SeedVR2 (3B-FP8) with &lt;strong&gt;per-shot &lt;code&gt;latent_noise&lt;/code&gt;&lt;/strong&gt;, batch = shot length.&lt;/li&gt;
&lt;li&gt;Low-VRAM? &lt;code&gt;--vae_decode_tiled&lt;/code&gt;. Big cloud GPU? whole-shot batches.&lt;/li&gt;
&lt;li&gt;Concat → mux audio → upload at 4K.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The generic, resume-safe pipeline (CUDA &lt;strong&gt;and&lt;/strong&gt; AMD ROCm), the shot detector, and the full writeup are all here:&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://github.com/andyskw/ig2-solarian-seedvr2-remaster" rel="noopener noreferrer"&gt;https://github.com/andyskw/ig2-solarian-seedvr2-remaster&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Credits: &lt;a href="https://github.com/numz/ComfyUI-SeedVR2_VideoUpscaler" rel="noopener noreferrer"&gt;SeedVR2&lt;/a&gt; (ByteDance Seed · NumZ · AInVFX), &lt;a href="https://www.scenedetect.com/" rel="noopener noreferrer"&gt;PySceneDetect&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>video</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
