<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aleksandr Gushchin</title>
    <description>The latest articles on DEV Community by Aleksandr Gushchin (@aleksandrgushchin).</description>
    <link>https://dev.to/aleksandrgushchin</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F634382%2F0a9d12a8-fde6-40af-a948-42db092cf241.png</url>
      <title>DEV Community: Aleksandr Gushchin</title>
      <link>https://dev.to/aleksandrgushchin</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aleksandrgushchin"/>
    <language>en</language>
    <item>
      <title>GSOC-2021 Work Product Submission, Xiph.Org Foundation</title>
      <dc:creator>Aleksandr Gushchin</dc:creator>
      <pubDate>Fri, 20 Aug 2021 21:43:57 +0000</pubDate>
      <link>https://dev.to/aleksandrgushchin/gsoc-2021-work-product-submission-xiph-org-foundation-3o9e</link>
      <guid>https://dev.to/aleksandrgushchin/gsoc-2021-work-product-submission-xiph-org-foundation-3o9e</guid>
      <description>&lt;p&gt;&lt;strong&gt;Student:&lt;/strong&gt; Aleksandr Gushchin&lt;br&gt;
&lt;strong&gt;Github Handle:&lt;/strong&gt; &lt;a class="mentioned-user" href="https://dev.to/aleksandrgushchin"&gt;@aleksandrgushchin&lt;/a&gt;
&lt;br&gt;
&lt;strong&gt;Project:&lt;/strong&gt; Improve fast scene detection modes proposal&lt;br&gt;
&lt;strong&gt;Mentor:&lt;/strong&gt; Luca Barbato&lt;br&gt;
&lt;strong&gt;Organisation:&lt;/strong&gt; Xiph.Org Foundation&lt;/p&gt;

&lt;h3&gt;
  
  
  Goals
&lt;/h3&gt;

&lt;p&gt;This summer, I contributed to the Xiph.Org Foundation. The main aim of this project was to improve the scene change detection algorithm, which determines where to split video sequences for optimal encoding efficiency. The currently implemented fast scene detection method is not optimal and sometimes gives false results. This is also detrimental to per-scene visual metric quality targeting. &lt;/p&gt;

&lt;h3&gt;
  
  
  Change Log
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;A dataset has been assembled to test the algorithm&lt;/li&gt;
&lt;li&gt;Metric value peaks have been made more distinctive for the algorithm to detect, resulting in better accuracy&lt;/li&gt;
&lt;li&gt;Thresholds have been adjusted for both versions of the algorithm&lt;/li&gt;
&lt;li&gt;An adaptive threshold has been implemented for the slow version&lt;/li&gt;
&lt;li&gt;A more accurate version of the algorithm has been implemented&lt;/li&gt;
&lt;li&gt;Downsampling has been added for this new version&lt;/li&gt;
&lt;li&gt;Detailed descriptions have been added for all three versions&lt;/li&gt;
&lt;li&gt;A CLI option for the scene detection speed mode has been added&lt;/li&gt;
&lt;li&gt;Unit tests have been updated for the new version&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/rust-av/av-scenechange"&gt;av-scenechange&lt;/a&gt; has been updated according to the new version of rav1e

&lt;ul&gt;
&lt;li&gt;A CLI option for the scene detection speed mode has been added&lt;/li&gt;
&lt;li&gt;A CLI option for the file to write results to has been added&lt;/li&gt;
&lt;li&gt;Speed measurement has been added &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Brief summary of the new versions of the algorithm
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;F score on BBC Planet Earth&lt;/th&gt;
&lt;th&gt;F score on open source videos&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;New&lt;/strong&gt; fast version&lt;/td&gt;
&lt;td&gt;0.7441&lt;/td&gt;
&lt;td&gt;0.6652&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Old fast version&lt;/td&gt;
&lt;td&gt;0.6543&lt;/td&gt;
&lt;td&gt;0.5951&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;New&lt;/strong&gt; medium version&lt;/td&gt;
&lt;td&gt;0.7802&lt;/td&gt;
&lt;td&gt;0.7032&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;New&lt;/strong&gt; slow version&lt;/td&gt;
&lt;td&gt;0.9217&lt;/td&gt;
&lt;td&gt;0.7504&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Old slow version&lt;/td&gt;
&lt;td&gt;0.7024&lt;/td&gt;
&lt;td&gt;0.5628&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Development process
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;To test the algorithm fairly, I needed a large, representative dataset. I found the &lt;a href="https://aimagelab.ing.unimore.it/imagelab/researchActivity.asp?idActivity=19"&gt;BBC Planet Earth dataset&lt;/a&gt;, but I still needed sequences with higher resolutions and different themes (all of the BBC videos were documentaries with a 388x280 resolution). I downloaded and manually marked up 20 videos from Vimeo. A more detailed description of the final dataset can be found &lt;a href="https://dev.to/aleksandrgushchin/dataset-for-scene-change-detection-4bf1"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;After collecting the data, I calculated the results of the current solution. They can be found &lt;a href="https://dev.to/aleksandrgushchin/results-of-current-algorithm-to-be-updated-o2f"&gt;here&lt;/a&gt; (a minimal sketch of such an evaluation is shown after this list).&lt;/li&gt;
&lt;li&gt;Detailed analysis of the current algorithm: I made charts and visualizations for different algorithm options and thresholds. They can be found &lt;a href="https://dev.to/aleksandrgushchin/july-12-june-19-weekly-status-1l97"&gt;here&lt;/a&gt; and &lt;a href="https://dev.to/aleksandrgushchin/juky-19-july-26-weekly-status-new-scene-change-detector-of-the-rav1e-analysis-348"&gt;here&lt;/a&gt;. I drew several conclusions on how to improve the current solution. &lt;/li&gt;
&lt;li&gt;Improving the current solution by adjusting thresholds and updating metric values. Detailed descriptions are &lt;a href="https://dev.to/aleksandrgushchin/threshold-experiments-for-scene-change-detector-2hbp"&gt;here&lt;/a&gt; and &lt;a href="https://dev.to/aleksandrgushchin/july-26-august-02-weekly-status-49gb"&gt;here&lt;/a&gt;. I made a &lt;a href="https://github.com/xiph/rav1e/pull/2765"&gt;pull request&lt;/a&gt; with these changes.&lt;/li&gt;
&lt;li&gt;New metric development: I experimented with motion vectors and color histograms to build a new dissimilarity metric on top of them. For the histogram-based metric I also experimented with distance functions. I tried to implement the edge change ratio but abandoned it because it turned out to be too slow. I focused on the histogram-based metric since it was the most accurate, and experimented with a block-based approach, with combining it with previous versions of the algorithm, and with shifting blocks by motion vectors. Results can be found &lt;a href="https://dev.to/aleksandrgushchin/august-02-august-03-weekly-status-565a"&gt;here&lt;/a&gt; and &lt;a href="https://dev.to/aleksandrgushchin/august-09-august-16-weekly-status-1k9k"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;After the third version was ready, I added it to the repo, provided a CLI option for users to manually choose a version, and updated the unit tests.&lt;/li&gt;
&lt;li&gt;A detailed description of the final result can be read &lt;a href="https://dev.to/aleksandrgushchin/new-scene-change-detector-version-4ja7"&gt;here&lt;/a&gt;, alongside unsuccessful ideas and possible improvements.&lt;/li&gt;
&lt;/ul&gt;
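
&lt;p&gt;For reference, the evaluation can be scripted along these lines. This is only a sketch of the idea (not my exact code): predicted and ground-truth scene change frames are matched with a small tolerance, and precision, recall and the F score are computed from the matches.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch: score a scene change detector against manual mark-up.
# `predicted` and `truth` are sorted lists of frame numbers; `tolerance`
# is how many frames a prediction may be off and still count as a hit.
def score(predicted, truth, tolerance=2):
    unmatched = list(truth)
    tp = 0
    for frame in predicted:
        hit = next((t for t in unmatched if abs(t - frame) &amp;lt;= tolerance), None)
        if hit is not None:
            unmatched.remove(hit)
            tp += 1
    fp = len(predicted) - tp
    fn = len(unmatched)
    precision = tp / (tp + fp) if predicted else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    f_score = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f_score

print(score([10, 55, 120, 300], [10, 57, 300, 480]))  # (0.75, 0.75, 0.75)
&lt;/code&gt;&lt;/pre&gt;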

&lt;h3&gt;
  
  
  Code
&lt;/h3&gt;

&lt;p&gt;Pull requests:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/xiph/rav1e/pull/2765"&gt;#2765&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/rust-av/av-scenechange/pull/162"&gt;#162&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Blog posts
&lt;/h3&gt;

&lt;p&gt;All posts can be found &lt;a href="https://dev.to/aleksandrgushchin"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Acknowledgement
&lt;/h3&gt;

&lt;p&gt;I'd like to thank my mentor Luca Barbato for always monitoring my progress and for immediately responding and guiding me whenever I needed help, as well as the whole Xiph team! &lt;/p&gt;

</description>
    </item>
    <item>
      <title>New Scene Change Detector version </title>
      <dc:creator>Aleksandr Gushchin</dc:creator>
      <pubDate>Wed, 18 Aug 2021 07:59:14 +0000</pubDate>
      <link>https://dev.to/aleksandrgushchin/new-scene-change-detector-version-4ja7</link>
      <guid>https://dev.to/aleksandrgushchin/new-scene-change-detector-version-4ja7</guid>
      <description>&lt;p&gt;There are three versions of the algorithm, selected based on the speed setting of rav1e. A detailed description of each version is given below, and a small sketch of the speed-to-version mapping follows the list. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fast version - pixel-based version with improved threshold. 

&lt;ul&gt;
&lt;li&gt;Corresponds to speed level 10 of rav1e&lt;/li&gt;
&lt;li&gt;Performs downsampling&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Medium version - based on motion vectors with an improved threshold. 

&lt;ul&gt;
&lt;li&gt;Corresponds to speed levels 7-9 of rav1e&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Slow version - histogram metric with block-based approach. 

&lt;ul&gt;
&lt;li&gt;Corresponds to speed levels 0-6 of rav1e&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
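
&lt;p&gt;Purely as an illustration of this mapping (not rav1e's actual code), choosing a detection version from the encoder speed setting could look like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative sketch of the speed-to-version mapping described above.
def scene_detection_version(speed):
    if speed == 10:
        return "fast"    # pixel-based metric, with downsampling
    if speed &amp;gt;= 7:
        return "medium"  # motion-vector based metric
    return "slow"        # block-based histogram metric

print([scene_detection_version(s) for s in (0, 6, 7, 9, 10)])
# ['slow', 'slow', 'medium', 'medium', 'fast']
&lt;/code&gt;&lt;/pre&gt;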

&lt;h3&gt;
  
  
  Results
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;F score on BBC Planet Earth&lt;/th&gt;
&lt;th&gt;F score on open source videos&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;New fast version&lt;/td&gt;
&lt;td&gt;0.7441&lt;/td&gt;
&lt;td&gt;0.6652&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Old fast version&lt;/td&gt;
&lt;td&gt;0.6543&lt;/td&gt;
&lt;td&gt;0.5951&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;New medium version&lt;/td&gt;
&lt;td&gt;0.7802&lt;/td&gt;
&lt;td&gt;0.7032&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;New slow version&lt;/td&gt;
&lt;td&gt;0.9217&lt;/td&gt;
&lt;td&gt;0.7504&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Old slow version&lt;/td&gt;
&lt;td&gt;0.7024&lt;/td&gt;
&lt;td&gt;0.5628&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;So the F score of the fast version improved by 0.0898 on BBC and by 0.0701 on open source videos.&lt;br&gt;
The F score of the slow version improved by 0.2193 on BBC and by 0.1876 on open source videos.&lt;/p&gt;

&lt;h3&gt;
  
  
  Description of each version
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fast version&lt;/strong&gt; is a simple calculation of the pixel-wise difference: for each pair of corresponding pixels the difference of values is taken, and the final dissimilarity metric is the average over all pixels. I improved the old version by adjusting the threshold and by modifying the metric itself with a numerical derivative.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Medium version&lt;/strong&gt; is an improved version of the old slow version with an adaptive threshold. To build the dissimilarity metric, motion vectors between two consecutive frames are computed; frames are divided into blocks and each block of the second frame is shifted by its motion vector. The dissimilarity metric is the average difference over all blocks. I improved the old version by adjusting the threshold and by modifying the metric itself with a numerical derivative.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slow version&lt;/strong&gt; is a block-based histogram metric. Frames are divided into non-overlapping blocks and a histogram of pixel values is computed for each block; the mean value of each block's histogram is compared with that of the corresponding block in the previous frame. The dissimilarity metric is the average difference over all blocks. A minimal sketch of this idea is shown after this list.&lt;/li&gt;
&lt;/ul&gt;
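
&lt;p&gt;Here is a minimal sketch of the block-based histogram idea behind the slow version, assuming 8-bit luma planes stored as NumPy arrays. It only illustrates the metric; the actual implementation lives in rav1e.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

# Sketch (illustrative): block-based histogram dissimilarity metric.
# Each luma plane is split into non-overlapping blocks; for every block a
# 256-bin histogram is built, its mean value is compared with the mean of
# the matching block in the previous frame, and the per-block differences
# are averaged into a single dissimilarity value.
def hist_mean(block):
    counts, edges = np.histogram(block, bins=256, range=(0, 256))
    return float((counts * edges[:-1]).sum() / counts.sum())

def block_histogram_dissimilarity(prev_plane, cur_plane, block=32):
    height, width = cur_plane.shape
    diffs = []
    for y in range(0, height - block + 1, block):
        for x in range(0, width - block + 1, block):
            prev_block = prev_plane[y:y + block, x:x + block]
            cur_block = cur_plane[y:y + block, x:x + block]
            diffs.append(abs(hist_mean(prev_block) - hist_mean(cur_block)))
    return float(np.mean(diffs))

prev = np.random.randint(0, 256, (288, 360), dtype=np.uint8)
cur = np.random.randint(0, 256, (288, 360), dtype=np.uint8)
print(block_histogram_dissimilarity(prev, cur))
&lt;/code&gt;&lt;/pre&gt;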

&lt;h3&gt;
  
  
  Results and examples of each version
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Slow version
&lt;/h4&gt;

&lt;p&gt;The slow version is marked in the legend as &lt;em&gt;"with blocks"&lt;/em&gt;. &lt;em&gt;"Without blocks"&lt;/em&gt; is a similar metric but without dividing the frames into blocks.&lt;br&gt;
Results on BBC dataset:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qdIdJOXX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/p7bc3kldc06bc7f4garh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qdIdJOXX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/p7bc3kldc06bc7f4garh.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--u8q9IxrL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/esey2w2bo3y1brmlf5ip.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--u8q9IxrL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/esey2w2bo3y1brmlf5ip.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
Results on open-source videos:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--nuWN50Kk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/x9en1owgh6oyono62926.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--nuWN50Kk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/x9en1owgh6oyono62926.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tgRQQ699--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vz2s2usqwwnm8l5kb354.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tgRQQ699--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vz2s2usqwwnm8l5kb354.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Medium version
&lt;/h4&gt;

&lt;p&gt;Here you can see charts of the algorithm's performance (F score, precision and recall) depending on the threshold. These results were obtained on open-source videos from youtube.com and vimeo.com:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_rAJMCQY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ttodfkd2sku61basvx57.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_rAJMCQY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ttodfkd2sku61basvx57.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jMyRq3lp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/747caqwhnz1q2em3yt2t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jMyRq3lp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/747caqwhnz1q2em3yt2t.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xqIF8fJl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ma8vfvjv6z2n8q4efgb3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xqIF8fJl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ma8vfvjv6z2n8q4efgb3.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here are also the results on the BBC Planet Earth dataset:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--apQvJbWQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rva4r30vpwrbpl0temb4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--apQvJbWQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rva4r30vpwrbpl0temb4.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5xW58o8X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ilmo9bhgq93ce94qh9rd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5xW58o8X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ilmo9bhgq93ce94qh9rd.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5JtkzDsS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7keabmbbpvj32i5t5mek.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5JtkzDsS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7keabmbbpvj32i5t5mek.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
And precision-recall curve:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--fMKVs_uG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/emvqhgte8av30aywh7ok.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fMKVs_uG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/emvqhgte8av30aywh7ok.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Fast version
&lt;/h4&gt;

&lt;p&gt;Here you can see experiments with the threshold for the fast version. The bold line represents the old fast version; the bottom line is the old slow version of the algorithm:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YoU0igok--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/or7tm3f2g8ill22y3auq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YoU0igok--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/or7tm3f2g8ill22y3auq.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Detailed analysis can be seen here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/aleksandrgushchin/threshold-experiments-for-scene-change-detector-2hbp"&gt;Fast version&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/aleksandrgushchin/august-02-august-03-weekly-status-565a"&gt;Medium version&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/aleksandrgushchin/august-02-august-03-weekly-status-565a"&gt;Slow version&lt;/a&gt; at the end of the post &lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Speed
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;Average FPS on BBC Planet Earth (360x288)&lt;/th&gt;
&lt;th&gt;Average FPS on open source videos (1280x720)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fast version&lt;/td&gt;
&lt;td&gt;234&lt;/td&gt;
&lt;td&gt;22&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Medium version&lt;/td&gt;
&lt;td&gt;222&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Slow version&lt;/td&gt;
&lt;td&gt;156&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Overall metric improvement
&lt;/h3&gt;

&lt;p&gt;Here I want to show how I improved the metric values in all versions compared with the old ones.&lt;br&gt;
The blue line represents the values of the algorithm's metric on frames, the orange line is the threshold, and the gray lines mark frames that the algorithm labeled as scene changes.&lt;br&gt;
Here is an example of the outcome on one of the videos:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--F7VlRXp9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vhm4qvkq2blgohnvsjmg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--F7VlRXp9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vhm4qvkq2blgohnvsjmg.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
The top picture shows the original metric values, the bottom one shows the metric after the improvement.&lt;br&gt;
You can see that the peaks with the scene changes became more distinct, so the threshold is easier to tune.&lt;/p&gt;
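
&lt;p&gt;As a rough sketch of the kind of transformation involved (just an illustration on a list of per-frame metric values, not the rav1e code): taking the difference of consecutive metric values suppresses a slowly drifting baseline, so the spikes at scene changes stand out more clearly against the threshold.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch: sharpen scene change peaks by differencing consecutive metric
# values before thresholding. Gradual changes produce small differences,
# while the spike at a cut stays large.
def sharpen(metric_values):
    return [0.0] + [cur - prev
                    for prev, cur in zip(metric_values, metric_values[1:])]

raw = [2.0, 2.5, 3.1, 3.8, 4.4, 5.0, 27.0, 5.6, 6.1]
print(sharpen(raw))
&lt;/code&gt;&lt;/pre&gt;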

&lt;h3&gt;
  
  
  Unsuccessful ideas
&lt;/h3&gt;

&lt;p&gt;Here is a list of ideas that I implemented but that turned out to be impractical:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slow version with motion vectors

&lt;ul&gt;
&lt;li&gt;Each block is shifted by its motion vector. This slowed the algorithm down even more and decreased the F score.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Combining medium version with the slow one

&lt;ul&gt;
&lt;li&gt;The idea was to mark a frame as a scene change if either version said so. Again, it slowed the algorithm down and did not bring any gain in F score. &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Separate metric for flashes

&lt;ul&gt;
&lt;li&gt;I implemented a few metrics for flash detection and tried them out. But flashes often occur several frames in a row and can contain a scene change, so it is difficult for the algorithm to decide whether a given run of flashes contains a scene change or not. &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Possible improvements:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Threshold 

&lt;ul&gt;
&lt;li&gt;The threshold could depend on recent metric values (the maximum over past frames, the mean and the std). The current threshold can perform worse than a static threshold in some cases. An example is in the picture: the threshold varies around the same value, and if it took into account the mean value of past frames, for example, it would be more accurate (a sketch of this idea appears after this list):
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SjGOLF-d--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lpm4raohfs4jfo30w87n.png" alt="Alt Text"&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Version based on edge detection

&lt;ul&gt;
&lt;li&gt;It could be useful to take into account another feature of frames - object edges. Combined with the existing versions, it could boost the F score.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Metric

&lt;ul&gt;
&lt;li&gt;Adjusting metric values according to neighboring values, for example by subtracting the mean value of the surrounding frames. &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Block-based metric improvement

&lt;ul&gt;
&lt;li&gt;It may be useful to treat the blocks individually rather than just taking the mean value over them all. For example, if the difference for &lt;em&gt;k&lt;/em&gt; blocks is near zero, the algorithm shouldn't mark the frame as a scene change no matter what the other blocks show; conversely, if &lt;em&gt;k&lt;/em&gt; blocks have a difference near the maximum possible value, the algorithm should mark the frame as a scene change regardless of the other blocks.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Downsampling for the medium and slow versions:

&lt;ul&gt;
&lt;li&gt;High-resolution videos could be downsampled to HD or so. This would significantly increase the speed while having only a small impact on the F score.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
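
&lt;p&gt;To make the first idea a bit more concrete, here is a minimal sketch (my own illustration, not something implemented in rav1e) of a threshold that follows the mean and standard deviation of the metric over the last few frames:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from collections import deque
from statistics import mean, stdev

# Sketch: adaptive threshold driven by the mean and std of recent metric
# values. A frame is flagged as a scene change when its metric value
# exceeds mean + k * std over the trailing window (with a minimum floor).
def detect(metric_values, window=30, k=2.0, floor=10.0):
    history = deque(maxlen=window)
    detections = []
    for frame, value in enumerate(metric_values):
        if len(history) &amp;gt;= 2:
            threshold = max(floor, mean(history) + k * stdev(history))
            if value &amp;gt; threshold:
                detections.append(frame)
        history.append(value)
    return detections

print(detect([2, 2, 3, 2, 2, 40, 3, 2, 2, 2, 35, 2]))  # [5, 10]
&lt;/code&gt;&lt;/pre&gt;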

</description>
    </item>
    <item>
      <title>August 09 - August 16 Weekly Status</title>
      <dc:creator>Aleksandr Gushchin</dc:creator>
      <pubDate>Fri, 13 Aug 2021 20:00:20 +0000</pubDate>
      <link>https://dev.to/aleksandrgushchin/august-09-august-16-weekly-status-1k9k</link>
      <guid>https://dev.to/aleksandrgushchin/august-09-august-16-weekly-status-1k9k</guid>
      <description>&lt;p&gt;This week I finished the analysis of the new metric based on color histograms. It can be read &lt;a href="https://dev.to/aleksandrgushchin/august-02-august-03-weekly-status-565a"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I experimented with combining this new metric with the current one. Here are the F scores on the BBC Planet Earth dataset for different versions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Current rav1e version: 

&lt;ul&gt;
&lt;li&gt;0.7024&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Improved threshold (the version from my latest PR): 

&lt;ul&gt;
&lt;li&gt;0.8081&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Version with color histogram based metric: 

&lt;ul&gt;
&lt;li&gt;0.8502&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Union of the latest two versions (a frame is considered a scene change if either of the above-mentioned metrics says so): 

&lt;ul&gt;
&lt;li&gt;0.8923&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Version with the color histogram metric and a block-based approach, where each frame is divided into blocks (more details at the bottom): 

&lt;ul&gt;
&lt;li&gt;0.9217&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When I say the intersection of algorithms, I mean an algorithm that marks a frame as a scene change only if both algorithms have marked it. &lt;br&gt;
Here is a picture that explains why I chose the union rather than the intersection. It should also be taken into account that the recall of the algorithms is higher than their precision:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yVeIx9Id--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tehmqzl1u3lnfnpt7cih.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yVeIx9Id--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tehmqzl1u3lnfnpt7cih.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
The numbers represent the amount of frames considered scene changes by the two versions of the algorithm and by the ground truth. &lt;br&gt;
Each number corresponds to one colored area.&lt;br&gt;
It can be seen that the ground truth contains around 90% of the intersection of these versions. &lt;/p&gt;
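
&lt;p&gt;In set terms (a small illustration with made-up frame numbers, not the real data), the union, the intersection and the share of the intersection confirmed by the ground truth can be computed like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustration with made-up frame numbers: union vs intersection of two
# detectors, and how much of the intersection the ground truth confirms.
detector_a = {12, 80, 143, 200, 310, 350, 400}
detector_b = {12, 80, 150, 200, 310, 350, 555}
ground_truth = {12, 80, 200, 310, 400, 480}

union = detector_a | detector_b
intersection = detector_a &amp;amp; detector_b
confirmed = intersection &amp;amp; ground_truth

print(sorted(union))         # flagged by either detector
print(sorted(intersection))  # flagged by both detectors
print(len(confirmed) / len(intersection))  # 0.8 here
&lt;/code&gt;&lt;/pre&gt;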

&lt;p&gt;I improved the histogram-based metric by dividing frames into blocks. The results can be seen below along with the regular histogram-based approach. The improved version is marked in the legend as &lt;em&gt;"with blocks"&lt;/em&gt;.&lt;br&gt;
Results on BBC dataset:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qdIdJOXX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/p7bc3kldc06bc7f4garh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qdIdJOXX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/p7bc3kldc06bc7f4garh.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--u8q9IxrL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/esey2w2bo3y1brmlf5ip.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--u8q9IxrL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/esey2w2bo3y1brmlf5ip.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
Results on open-source videos:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--nuWN50Kk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/x9en1owgh6oyono62926.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--nuWN50Kk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/x9en1owgh6oyono62926.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tgRQQ699--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vz2s2usqwwnm8l5kb354.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tgRQQ699--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vz2s2usqwwnm8l5kb354.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But&lt;/strong&gt; the calculation speed became &lt;strong&gt;0.75x&lt;/strong&gt; of the current version on the BBC dataset (resolution 360x288) and &lt;strong&gt;0.56x&lt;/strong&gt; on open-source videos (resolution 1280x720). &lt;/p&gt;

&lt;p&gt;I checked whether combining this metric with the current one pays off. The average increase in the F score is about 0.01-0.02, which, considering the even greater decrease in speed, is unreasonable.&lt;/p&gt;

&lt;p&gt;Also, I implemented a block-based histogram approach that takes motion vectors into account. The results will be published here soon.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>August 02 - August 09 Weekly Status</title>
      <dc:creator>Aleksandr Gushchin</dc:creator>
      <pubDate>Fri, 06 Aug 2021 20:46:25 +0000</pubDate>
      <link>https://dev.to/aleksandrgushchin/august-02-august-03-weekly-status-565a</link>
      <guid>https://dev.to/aleksandrgushchin/august-02-august-03-weekly-status-565a</guid>
      <description>&lt;p&gt;This week I experimented with the new metrics that I implemented. &lt;/p&gt;

&lt;p&gt;I implemented two metrics, based on a color histogram and on motion vectors. Since motion vectors are already used in the current version of the algorithm, I focused on the histogram metric. The code for it can be found &lt;a href="https://github.com/alexlqrs/rav1e/blob/master/src/api/lookahead.rs"&gt;here&lt;/a&gt;. I used the histogram crate for this implementation. I calculate the histogram of the first plane of the frame (the luma component) and compare it to the previous frame's histogram. I used 4 metrics to calculate the differences between these histograms:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The difference between mean values &lt;/li&gt;
&lt;li&gt;The difference between std values&lt;/li&gt;
&lt;li&gt;Taxicab distance &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pQFXuLjO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0pr3036m2c1idv1qw6f4.png" alt="Alt Text"&gt;
where p and q are the histograms and n is the number of bins in each.&lt;/li&gt;
&lt;li&gt;The square of the Euclidean distance&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After that, I subtract the current value from the previous one to make the peaks more distinctive for the threshold.&lt;/p&gt;
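
&lt;p&gt;For illustration only (not the linked implementation, which uses the histogram crate on the luma plane), the four histogram differences above can be written down like this for two histograms with the same number of bins:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

# Sketch of the four histogram differences described above. p and q are
# histograms of two consecutive frames (counts per bin, same bin count).
def hist_stats(hist):
    bins = np.arange(len(hist))
    total = hist.sum()
    mean = (bins * hist).sum() / total
    std = np.sqrt((((bins - mean) ** 2) * hist).sum() / total)
    return mean, std

def mean_difference(p, q):
    return abs(hist_stats(p)[0] - hist_stats(q)[0])

def std_difference(p, q):
    return abs(hist_stats(p)[1] - hist_stats(q)[1])

def taxicab_distance(p, q):
    return np.abs(p - q).sum()

def squared_euclidean_distance(p, q):
    return ((p - q) ** 2).sum()

p = np.histogram(np.random.randint(0, 256, 10000), bins=256, range=(0, 256))[0]
q = np.histogram(np.random.randint(0, 256, 10000), bins=256, range=(0, 256))[0]
print(mean_difference(p, q), taxicab_distance(p, q))
&lt;/code&gt;&lt;/pre&gt;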

&lt;p&gt;Below you can find examples of these 4 distances, in the same order, on the same video. The pictures show the scene changes (gray vertical lines) and the final metric (blue line): &lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--HGQ_-sQY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j6qfxqjrj1dh7kit87ub.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HGQ_-sQY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j6qfxqjrj1dh7kit87ub.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bOWBadRt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zqiv2ojwgrc0p9ey8ck1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bOWBadRt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zqiv2ojwgrc0p9ey8ck1.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--z0or0ZHt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/a5kywh6sb1p99lwy19ut.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--z0or0ZHt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/a5kywh6sb1p99lwy19ut.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jONNML1u--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/avov19jnrg7sa6zkaqll.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jONNML1u--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/avov19jnrg7sa6zkaqll.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It can be seen that the metric with the least distinctive peaks is the one with STD difference.&lt;/p&gt;

&lt;p&gt;Below you can find results for the first two distances:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mean values of histograms:
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fTOSq00j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5l3g1n3xy1t641dv8u5e.png" alt="Alt Text"&gt;
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--j7o3hl7Y--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gq4dikaytch2h6dvaipp.png" alt="Alt Text"&gt;
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9jyMLMLu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/80s919mif0xa3fqt1c9y.png" alt="Alt Text"&gt;
&lt;/li&gt;
&lt;li&gt;STD values of histograms:
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--onZG9dNQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/v3f2254v44666hus726q.png" alt="Alt Text"&gt;
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GiQahqxR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1cryb5lhaq583pbqhp83.png" alt="Alt Text"&gt;
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--13CJCuMM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gcpzyvhvn8n869ihmlb6.png" alt="Alt Text"&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Below you can see the results for these metrics on the BBC Planet Earth dataset and on the manually marked-up open source videos:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;F score on BBC Planet Earth&lt;/th&gt;
&lt;th&gt;F score on open source videos&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;#1(mean)&lt;/td&gt;
&lt;td&gt;0.8502&lt;/td&gt;
&lt;td&gt;0.6532&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;#2(std)&lt;/td&gt;
&lt;td&gt;0.6543&lt;/td&gt;
&lt;td&gt;0.5951&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;#3(Euclidean)&lt;/td&gt;
&lt;td&gt;0.7031&lt;/td&gt;
&lt;td&gt;0.6002&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;#4(Taxicab)&lt;/td&gt;
&lt;td&gt;0.7143&lt;/td&gt;
&lt;td&gt;0.6231&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The speed of the algorithm with this metric is ~0.87x the speed of the current version of the algorithm. &lt;/p&gt;

&lt;p&gt;Summary:&lt;br&gt;
The F score of the new metrics is better than that of the current one. &lt;br&gt;
The new metrics are a bit slower than the current metric.&lt;br&gt;&lt;br&gt;
But since they use different characteristics of the frame (motion vectors and color histograms), in combination they could enhance each other and increase the final F score.&lt;/p&gt;

&lt;p&gt;TO DO:&lt;br&gt;
&lt;del&gt;Precision recall curves for these metrics with different thresholds.&lt;/del&gt;&lt;br&gt;
Correlation with previous metric. Would it be better to combine these metrics or use them separately?&lt;/p&gt;

</description>
    </item>
    <item>
      <title>July 26 - August 02 Weekly Status</title>
      <dc:creator>Aleksandr Gushchin</dc:creator>
      <pubDate>Fri, 30 Jul 2021 19:52:01 +0000</pubDate>
      <link>https://dev.to/aleksandrgushchin/july-26-august-02-weekly-status-49gb</link>
      <guid>https://dev.to/aleksandrgushchin/july-26-august-02-weekly-status-49gb</guid>
      <description>&lt;p&gt;This week I implemented changes to the scene change algorithm and made a &lt;a href="https://github.com/xiph/rav1e/pull/2765"&gt;pull request&lt;/a&gt; to the GitHub repository. The main analysis can be seen in &lt;a href="https://dev.to/aleksandrgushchin/threshold-experiments-for-scene-change-detector-2hbp"&gt;this blog post&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I increased the threshold for the fast version

&lt;ul&gt;
&lt;li&gt;This increased the F score by 0.1502, up to 0.7441&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;I applied numerical differentiation to the metric values to make the peaks of the metric more distinguishable for the threshold&lt;/li&gt;
&lt;li&gt;I reduced the threshold for the slow version

&lt;ul&gt;
&lt;li&gt;This increased the F score by 0.1056, up to 0.8081&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Overall, I improved the F score of the fast version by 0.1454 and that of the slow version by 0.1512.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Threshold Experiments for scene change detector</title>
      <dc:creator>Aleksandr Gushchin</dc:creator>
      <pubDate>Tue, 27 Jul 2021 20:48:17 +0000</pubDate>
      <link>https://dev.to/aleksandrgushchin/threshold-experiments-for-scene-change-detector-2hbp</link>
      <guid>https://dev.to/aleksandrgushchin/threshold-experiments-for-scene-change-detector-2hbp</guid>
      <description>&lt;h3&gt;
  
  
  Fast version's threshold
&lt;/h3&gt;

&lt;p&gt;I experimented with the threshold of the fast version of the algorithm:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--1DMUu75m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/a7pqtopbqtyxov48902z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--1DMUu75m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/a7pqtopbqtyxov48902z.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
This picture shows the F score on the testing dataset: the X axis shows the number of the video, the Y axis shows the F score on that video. The lines represent different versions of the algorithm according to the legend.&lt;br&gt;
It can be seen that increasing the threshold value improves the F score of the algorithm and does not affect the processing speed.&lt;/p&gt;

&lt;p&gt;Here is a table of the mean F score on both datasets for different versions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;F score on BBC Planet Earth&lt;/th&gt;
&lt;th&gt;F score on open source videos&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fast with thr = 12 (current)&lt;/td&gt;
&lt;td&gt;0.5939&lt;/td&gt;
&lt;td&gt;0.6011&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fast with thr = 15&lt;/td&gt;
&lt;td&gt;0.6490&lt;/td&gt;
&lt;td&gt;0.6361&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fast with thr = 16&lt;/td&gt;
&lt;td&gt;0.6961&lt;/td&gt;
&lt;td&gt;0.6375&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fast with thr = 17&lt;/td&gt;
&lt;td&gt;0.7393&lt;/td&gt;
&lt;td&gt;0.6623&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fast with thr = 18&lt;/td&gt;
&lt;td&gt;0.7441&lt;/td&gt;
&lt;td&gt;0.6652&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fast with thr = 20&lt;/td&gt;
&lt;td&gt;0.7795&lt;/td&gt;
&lt;td&gt;0.6244&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Slow&lt;/td&gt;
&lt;td&gt;0.7024&lt;/td&gt;
&lt;td&gt;0.5628&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Slow improved&lt;/td&gt;
&lt;td&gt;0.8081&lt;/td&gt;
&lt;td&gt;0.6515&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;I chose 18 as the optimal value of the threshold.&lt;/p&gt;
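
&lt;p&gt;The sweep behind this choice is straightforward; a rough sketch (not my exact script) that picks the threshold with the best mean F score over a set of videos could look like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch of a threshold sweep over saved per-frame metric values.
def f_score_exact(detected, truth):
    # Simple exact-frame matching; a real evaluation would also allow a
    # small tolerance around each ground-truth cut.
    tp = len(set(detected) &amp;amp; set(truth))
    if tp == 0:
        return 0.0
    precision = tp / len(detected)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)

def detect_at(metric_values, threshold):
    return [i for i, v in enumerate(metric_values) if v &amp;gt; threshold]

def best_threshold(videos, candidates):
    # `videos` is a list of (metric_values, ground_truth_frames) pairs.
    results = []
    for threshold in candidates:
        scores = [f_score_exact(detect_at(m, threshold), t) for m, t in videos]
        results.append((sum(scores) / len(scores), threshold))
    return max(results)  # (best mean F score, corresponding threshold)
&lt;/code&gt;&lt;/pre&gt;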

&lt;h3&gt;
  
  
  Slow version's threshold
&lt;/h3&gt;

&lt;p&gt;After the experiments, the threshold for the slow version was reduced by a factor of 2.2. Here you can see charts of the algorithm's performance (F score, precision and recall) depending on the threshold reduction factor. These results were obtained on open-source videos from youtube.com and vimeo.com:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_rAJMCQY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ttodfkd2sku61basvx57.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_rAJMCQY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ttodfkd2sku61basvx57.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jMyRq3lp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/747caqwhnz1q2em3yt2t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jMyRq3lp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/747caqwhnz1q2em3yt2t.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xqIF8fJl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ma8vfvjv6z2n8q4efgb3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xqIF8fJl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ma8vfvjv6z2n8q4efgb3.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here are also the results on the BBC Planet Earth dataset:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--apQvJbWQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rva4r30vpwrbpl0temb4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--apQvJbWQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rva4r30vpwrbpl0temb4.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5xW58o8X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ilmo9bhgq93ce94qh9rd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5xW58o8X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ilmo9bhgq93ce94qh9rd.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5JtkzDsS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7keabmbbpvj32i5t5mek.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5JtkzDsS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7keabmbbpvj32i5t5mek.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
And precision-recall curve:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--fMKVs_uG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/emvqhgte8av30aywh7ok.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fMKVs_uG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/emvqhgte8av30aywh7ok.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Improvement of the metric
&lt;/h3&gt;

&lt;p&gt;To make the peaks of the metric more distinguishable for the threshold, I applied numerical differentiation to its values. Here is an example of the outcome on one of the videos:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--F7VlRXp9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vhm4qvkq2blgohnvsjmg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--F7VlRXp9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vhm4qvkq2blgohnvsjmg.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
The top picture shows the original metric values, the bottom one shows the metric after the improvement. &lt;br&gt;
You can see that the peaks with the scene changes became more distinct, so the threshold is easier to tune.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>July 19 - July 26 Weekly Status / New scene change detector of the rav1e analysis </title>
      <dc:creator>Aleksandr Gushchin</dc:creator>
      <pubDate>Fri, 23 Jul 2021 21:13:14 +0000</pubDate>
      <link>https://dev.to/aleksandrgushchin/juky-19-july-26-weekly-status-new-scene-change-detector-of-the-rav1e-analysis-348</link>
      <guid>https://dev.to/aleksandrgushchin/juky-19-july-26-weekly-status-new-scene-change-detector-of-the-rav1e-analysis-348</guid>
      <description>&lt;p&gt;At the beginning of this week I started implementing changes to the scene change detector threshold. I lowered its values and also changed the behavior when the max_keyint and min_keyint options are used (before, the algorithm chose non-optimal frames; now it chooses the frames with the highest metric value). Before I finished implementing other threshold strategies, I found out about an update to the algorithm. &lt;/p&gt;

&lt;p&gt;After that I analyzed the new version of the scene change detector in rav1e. &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;On 4k videos the fast version of the algorithm works better than the slow version.
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--iKAEYmA1--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/07h57c57xako3qag5wvo.jpg" alt="Alt Text"&gt;
The x-axis shows the number of the video in the dataset, the y-axis shows the F score of the algorithm. The blue line is the fast version, the yellow one is the slow version. It can be seen that the fast version shows much better results. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;But&lt;/strong&gt; on the BBC Planet Earth dataset the slow version shows better results.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;F score&lt;/th&gt;
&lt;th&gt;Precision&lt;/th&gt;
&lt;th&gt;Recall&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Slow&lt;/td&gt;
&lt;td&gt;0.7024&lt;/td&gt;
&lt;td&gt;0.6452&lt;/td&gt;
&lt;td&gt;0.8013&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;0.5939&lt;/td&gt;
&lt;td&gt;0.4739&lt;/td&gt;
&lt;td&gt;0.7975&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You can see from the table that the fast version has a recall similar to the slow version's but worse precision. &lt;br&gt;
So I will try to improve the precision by increasing the base threshold. The fast version has no adaptive threshold either, so I will implement one and experiment with it.&lt;br&gt;
Definitions of the F score, precision and recall can be found on Wikipedia. In short, the higher the precision, the fewer false positive frames; the higher the recall, the fewer misses by the algorithm. The F score acts as a balance between precision and recall.&lt;/p&gt;
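
&lt;p&gt;For completeness, in terms of true positives (TP), false positives (FP) and false negatives (FN) the three values are computed as follows:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Precision, recall and F score from true/false positives and false negatives.
def precision_recall_f(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score

# Made-up example: 80 correct detections, 40 false alarms, 20 missed cuts.
print(precision_recall_f(80, 40, 20))  # roughly (0.67, 0.80, 0.73)
&lt;/code&gt;&lt;/pre&gt;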

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;An example of how low the base threshold for the fast version is.&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6ez1D7PU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6ldhh1hfav49lma8ajhc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6ez1D7PU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6ldhh1hfav49lma8ajhc.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_4JBWWYa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qccrhskkchllv873gq0b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_4JBWWYa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qccrhskkchllv873gq0b.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
The blue line is the metric value, the orange one is the threshold. The vertical grey lines show the scene changes: on the first picture the grey lines are the ground truth, on the second they are the scene changes predicted by the algorithm. As you can see, on the second picture there are a lot of false positives. If the threshold value were around 20-24, the precision and F score would be a lot higher.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;On the other hand, for the slow version of the algorithm the threshold is still too high, as can be seen from these two pictures.&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--I1RsB4Et--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2dqhsg6n2ff3b8ibma8x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--I1RsB4Et--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2dqhsg6n2ff3b8ibma8x.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--PuTHUyk6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7hp49yoqvc2qztow8zah.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--PuTHUyk6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7hp49yoqvc2qztow8zah.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
The concept is similar to the pictures above, except for the version of the algorithm and the video used. It can be seen that if the threshold were lower, the algorithm would have a higher F score. &lt;br&gt;
Examples with other videos:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--FFgL6Arf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/cjjjycpjdw3afkojddx5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--FFgL6Arf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/cjjjycpjdw3afkojddx5.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--U_S-hEbQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/az18u17p2g0j2j49c0vv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--U_S-hEbQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/az18u17p2g0j2j49c0vv.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
Third video:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YMdYz9Wz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/pammwfvoghjp4mt5mffh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YMdYz9Wz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/pammwfvoghjp4mt5mffh.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--48_9Hr4O--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bpqd4fn3jcdym99qye3s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--48_9Hr4O--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bpqd4fn3jcdym99qye3s.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
A similar problem is observed in the rest of the video.&lt;br&gt;
Another example:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--SjGOLF-d--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lpm4raohfs4jfo30w87n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SjGOLF-d--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lpm4raohfs4jfo30w87n.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--EnRxEf3G--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5up9jskj3ellq859z2ex.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--EnRxEf3G--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5up9jskj3ellq859z2ex.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;On average, the speed of the fast version is 1.3 times the speed of the slow version.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The new version of the algorithm is better than the old one by about 0.05-0.1 in terms of F score. Based on the results of the analysis, it can be improved even further.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
    </item>
    <item>
      <title>July 12 - July 19 Weekly Status</title>
      <dc:creator>Aleksandr Gushchin</dc:creator>
      <pubDate>Fri, 16 Jul 2021 20:32:35 +0000</pubDate>
      <link>https://dev.to/aleksandrgushchin/july-12-june-19-weekly-status-1l97</link>
      <guid>https://dev.to/aleksandrgushchin/july-12-june-19-weekly-status-1l97</guid>
      <description>&lt;p&gt;This week I experimented with the threshold, following one of my previous &lt;a href="https://dev.to/aleksandrgushchin/metric-visualization-and-analysis-590o"&gt;posts&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The first thing I did was lower the threshold itself. After a series of experiments I chose to decrease it by 35%. &lt;/p&gt;

&lt;p&gt;You can see the results of lowering the threshold here:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--LG5H7UT0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1powfxpbokvz82h1vpxh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--LG5H7UT0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1powfxpbokvz82h1vpxh.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
To compare them with the previous results, here is a picture from another &lt;a href="https://dev.to/aleksandrgushchin/results-of-current-algorithm-to-be-updated-o2f"&gt;blogpost&lt;/a&gt;:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Qa-ByJlH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/54nsmgcyzvaby1jhnea8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Qa-ByJlH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/54nsmgcyzvaby1jhnea8.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The average F score on the complete dataset increased by ~0.096.&lt;/p&gt;

&lt;p&gt;I also tested different strategies for adapting the threshold.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If the max_interval option is specified and the metric stays below the threshold for max_interval frames, the algorithm now chooses not the last frame of this series but the one with the highest metric value (see the sketch after the example below).
You can see examples here:
Before:
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tZSmsVi5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/19nbpubopxxpn9lmcatb.png" alt="Alt Text"&gt;
After:
&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Q3YXx6iY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yp4m5ny8tt8pl8plo09s.png" alt="Alt Text"&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The max_interval option here is 500. You can see that before the change the algorithm chose exactly the 500th frame, while after the change it chose the frame with the highest metric value.&lt;/p&gt;
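
&lt;p&gt;As a rough illustration of this change, here is a minimal Rust sketch of picking the cut position inside a max_interval window (the function name and the slice of per-frame metric values are hypothetical; this is not the actual rav1e code):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Minimal sketch: if the metric stayed below the threshold for
// `max_interval` frames, cut at the frame with the highest metric
// value inside that window instead of at the last frame.
fn forced_cut_index(metrics: &amp;[f64], max_interval: usize) -&gt; Option&lt;usize&gt; {
    if metrics.len() &lt; max_interval {
        return None; // the forced cut is not due yet
    }
    metrics[..max_interval]
        .iter()
        .enumerate()
        .max_by(|(_, a), (_, b)| a.partial_cmp(b).unwrap())
        .map(|(i, _)| i)
}
&lt;/code&gt;&lt;/pre&gt;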

&lt;p&gt;The second change addresses the fact that the threshold does not take the following metric values into account:&lt;br&gt;
the algorithm now does not mark the current frame as a scene change if the metric value of the next frame is bigger than the current one. This helps prevent the algorithm from marking a series of consecutive frames as scene changes (example below):&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hzaBwkHl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/o6epxue5vz79vw66y4ny.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hzaBwkHl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/o6epxue5vz79vw66y4ny.jpeg" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I haven't pushed these changes to the GitHub repos yet and plan to do so early next week.&lt;/p&gt;

&lt;p&gt;The most difficult remaining problem is the metric itself. I will take a closer look at it to see if there is a way to correct it a little. After that, I will start implementing a new metric to compensate for the weaknesses of the current one.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>June 21 - June 28 Weekly Status</title>
      <dc:creator>Aleksandr Gushchin</dc:creator>
      <pubDate>Sat, 26 Jun 2021 20:33:18 +0000</pubDate>
      <link>https://dev.to/aleksandrgushchin/june-21-june-28-weekly-status-1a5c</link>
      <guid>https://dev.to/aleksandrgushchin/june-21-june-28-weekly-status-1a5c</guid>
      <description>&lt;p&gt;This week I tested how the speed of the algorithm depends on the resolution of the video. So far, the results seem strange, because downsampling from 4K to 2K did not speed up the algorithm much (approximately 1.2-1.3x faster). &lt;br&gt;
Then I measured the cost of downsampling alone. Inside the video frame reading function, I set up a loop of 10000 iterations; on each iteration, memory was allocated for a new, smaller frame, and the planes to which downsampling was applied were copied into that new frame variable. It ran at about 200 iterations per second, which is far faster than the regular scene change detector's processing speed on 4K video (2.5 frames per second). So the downsampling itself should not affect the speed much, and this remains a problem to solve. &lt;br&gt;
I also exported the algorithm's metric values to a JSON file and visualized them in a few charts. I made a &lt;a href="https://dev.to/aleksandrgushchin/metric-visualization-and-analysis-590o"&gt;blog post&lt;/a&gt; about it. Based on the visualization, I drew some conclusions for future work, which are written at the end of the post.&lt;/p&gt;
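
&lt;p&gt;For reference, a micro-benchmark of this kind can be set up roughly like the sketch below (the function and the downscale closure are placeholders, not the actual rav1e code):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;use std::time::Instant;

// Rough sketch of the micro-benchmark: run the downsampling step
// `iterations` times and report the throughput. The `downscale`
// closure stands in for "allocate a smaller frame and copy the
// downsampled planes into it"; it is a placeholder, not rav1e API.
fn bench_downsampling&lt;F: FnMut()&gt;(iterations: u32, mut downscale: F) {
    let start = Instant::now();
    for _ in 0..iterations {
        downscale();
    }
    let per_second = f64::from(iterations) / start.elapsed().as_secs_f64();
    println!("{:.1} iterations per second", per_second);
}
&lt;/code&gt;&lt;/pre&gt;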

</description>
    </item>
    <item>
      <title>Metric visualization and analysis</title>
      <dc:creator>Aleksandr Gushchin</dc:creator>
      <pubDate>Fri, 25 Jun 2021 20:09:47 +0000</pubDate>
      <link>https://dev.to/aleksandrgushchin/metric-visualization-and-analysis-590o</link>
      <guid>https://dev.to/aleksandrgushchin/metric-visualization-and-analysis-590o</guid>
      <description>&lt;p&gt;The pictures in this blog post each show two charts that differ only in the gray vertical lines: in the upper chart the gray lines represent the ground truth, and in the lower chart they represent the result of the algorithm. &lt;br&gt;
The blue line represents the values of the algorithm's metric on each frame, and the orange line is the threshold.&lt;/p&gt;

&lt;p&gt;My first observation is that the threshold is often too high. This can be illustrated with this picture:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--W4VL3bGU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/z0ilnr3o957jsced5ogf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--W4VL3bGU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/z0ilnr3o957jsced5ogf.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--T-6qNjQ_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kgzcl3kbz2dqzf8d1kj8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--T-6qNjQ_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kgzcl3kbz2dqzf8d1kj8.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The second observation is that the interval parameters (min interval and max interval) often hurt the algorithm (especially the max parameter):&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--w6O6zLEy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7yiyeco04v7s5ufj9ogu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--w6O6zLEy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7yiyeco04v7s5ufj9ogu.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
You can see that the algorithm marked exactly the 500th frame as a scene change, although there are earlier frames with a higher metric.&lt;/p&gt;

&lt;p&gt;The third observation is that the algorithm often fails not because of the threshold but because of the metric itself:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--TJ3ombUF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1u1r6qynth0nomjf9g5j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--TJ3ombUF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1u1r6qynth0nomjf9g5j.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The fourth observation is that, in general, the threshold does not behave optimally: it does not take the following metric values into account, and because of this a lot of consecutive frames can be marked as scene changes:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uvnf34Yr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vftu2rxzo9br1rc9bd64.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uvnf34Yr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vftu2rxzo9br1rc9bd64.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--wrQT1yGW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/c381dgagx3cnwqptx8uo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--wrQT1yGW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/c381dgagx3cnwqptx8uo.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
You can see that around the 2000th frame the metric grows steadily, followed by a delayed growth of the threshold. Because of this, about 100 consecutive frames are marked as scene changes. Here is a closer look at it:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0jp_s75U--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ny9b1bt67z83v510vr64.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0jp_s75U--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ny9b1bt67z83v510vr64.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Summary:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The threshold is often way too high for single spikes of metric values&lt;/li&gt;
&lt;li&gt;Handling of the max interval parameter can be improved; it is better to mark an earlier frame with a higher metric value&lt;/li&gt;
&lt;li&gt;The threshold does not take the following metric values into account&lt;/li&gt;
&lt;li&gt;The metric itself can be wrong (usually it either works well on the whole video or poorly on the whole video)&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>June 14 - June 21 Weekly Status</title>
      <dc:creator>Aleksandr Gushchin</dc:creator>
      <pubDate>Mon, 21 Jun 2021 19:32:15 +0000</pubDate>
      <link>https://dev.to/aleksandrgushchin/june-14-june-21-weekly-status-14hm</link>
      <guid>https://dev.to/aleksandrgushchin/june-14-june-21-weekly-status-14hm</guid>
      <description>&lt;p&gt;This week I modified the &lt;a href="https://github.com/AleksandrGushchin/av-scenechange"&gt;code&lt;/a&gt; used to test the algorithm. Namely, I added command line arguments for downsampling the video, outputting the result as a JSON file, and measuring the speed of the algorithm. After that, I tested the algorithm on my dataset. Based on the test results, I made a blog post with the data and charts. During testing, I varied the min_key and max_key command line parameters and made charts for each part of the dataset (documentaries, 4K videos and complex videos from YouTube) based on this data.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Results of current algorithm</title>
      <dc:creator>Aleksandr Gushchin</dc:creator>
      <pubDate>Fri, 18 Jun 2021 19:54:10 +0000</pubDate>
      <link>https://dev.to/aleksandrgushchin/results-of-current-algorithm-to-be-updated-o2f</link>
      <guid>https://dev.to/aleksandrgushchin/results-of-current-algorithm-to-be-updated-o2f</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3Gt3VKsr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1drn8ibhzukb2y48sl5a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3Gt3VKsr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1drn8ibhzukb2y48sl5a.png" alt="Alt"&gt;&lt;/a&gt;&lt;br&gt;
Results on different parts of the dataset. Each line represents a different value of the minimum interval (&lt;em&gt;--min-scenecut command line parameter&lt;/em&gt;). &lt;br&gt;
It can be seen that the performance of the algorithm on difficult YouTube videos is worse than on documentary films from the BBC.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/AleksandrGushchin/av-scenechange"&gt;Here is  the code for testing&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The F score can be calculated using the formula below:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--NIzSCSz_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/h1mnstq08mnd7j64ttbn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NIzSCSz_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/h1mnstq08mnd7j64ttbn.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
where Precision and Recall are:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Wd61CkZN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kmtvqb93dp0wklcyz8x8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Wd61CkZN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kmtvqb93dp0wklcyz8x8.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;tp&lt;/em&gt; is the number of correctly detected scene changes&lt;br&gt;
&lt;em&gt;tn&lt;/em&gt; is the number of correctly detected frames without scene changes&lt;br&gt;
&lt;em&gt;fp&lt;/em&gt; is the number of false alarms&lt;br&gt;
&lt;em&gt;fn&lt;/em&gt; is the number of missed scene changes&lt;/p&gt;
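
&lt;p&gt;Assuming the standard F1 definition (Precision = tp / (tp + fp), Recall = tp / (tp + fn), F = 2 * Precision * Recall / (Precision + Recall)), the score can be computed with a small helper like this hypothetical sketch:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Standard F1 score from the counts defined above:
// precision = tp / (tp + fp), recall = tp / (tp + fn).
fn f_score(tp: f64, fp: f64, fn_: f64) -&gt; f64 {
    let precision = tp / (tp + fp);
    let recall = tp / (tp + fn_);
    2.0 * precision * recall / (precision + recall)
}
&lt;/code&gt;&lt;/pre&gt;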

&lt;h5&gt;
  
  
  The average speed of the algorithm:
&lt;/h5&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Part&lt;/th&gt;
&lt;th&gt;FPS&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;4K videos&lt;/td&gt;
&lt;td&gt;2.5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BBC Dataset&lt;/td&gt;
&lt;td&gt;238&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Other videos&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;TO DO:&lt;br&gt;
Dependency of F score and speed on resolution&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
