<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Angus Comrie</title>
    <description>The latest articles on DEV Community by Angus Comrie (@veggiesaurus).</description>
    <link>https://dev.to/veggiesaurus</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F387250%2F0c56d2d6-e507-4009-9539-2838f31411b2.png</url>
      <title>DEV Community: Angus Comrie</title>
      <link>https://dev.to/veggiesaurus</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/veggiesaurus"/>
    <language>en</language>
    <item>
      <title>Contouring in CARTA: Part 1 - Contour generation</title>
      <dc:creator>Angus Comrie</dc:creator>
      <pubDate>Sat, 16 May 2020 20:20:39 +0000</pubDate>
      <link>https://dev.to/carta/contouring-in-carta-part-1-contour-generation-1a72</link>
      <guid>https://dev.to/carta/contouring-in-carta-part-1-contour-generation-1a72</guid>
      <description>&lt;h3&gt;
  
  
  Introduction &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Astronomers often want to calculate contours of images at a given set of contour levels, and overlay them either on top of the same image, or another image entirely. Other image visualisation packages such as &lt;a href="https://sites.google.com/cfa.harvard.edu/saoimageds9"&gt;SAOImage DS9&lt;/a&gt; and &lt;a href="https://www.atnf.csiro.au/computing/software/karma/"&gt;Karma kvis&lt;/a&gt; have good support for contour generation and overlays, so contour support was something our users had eagerly anticipated from our initial release.&lt;/p&gt;

&lt;p&gt;For standard raster images, CARTA's server-client model streams tiles of compressed (using the &lt;a href="https://github.com/LLNL/zfp"&gt;ZFP&lt;/a&gt; library), downsampled floating-point data, rather than sending rendered images. So my initial tests of contouring in CARTA were focused on performing the contouring directly in the browser, using a WebAssembly port of the widely-used contouring algorithm from &lt;a href="https://heasarc.gsfc.nasa.gov/ftools/fv"&gt;Fv&lt;/a&gt;, with a pre-contouring smoothing step (See Smoothing section below) performed on the GPU in WebGL, as shown in the following codepen:&lt;br&gt;
&lt;iframe height="600" src="https://codepen.io/veggiesaurus/embed/vQNmQW?height=600&amp;amp;default-tab=result&amp;amp;embed-version=2"&gt;
&lt;/iframe&gt;
&lt;br&gt;
GPU Accelerated Gaussian blurring is really fast: the main bottleneck is getting the data too and from the GPU using WebGL. The basic outline is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Send the raster data to the GPU as a floating point texture&lt;/li&gt;
&lt;li&gt;Use a WebGL shader to apply the 2D Gaussian filter&lt;/li&gt;
&lt;li&gt;Encode the filtered data into an RGBA texture, as WebGL limits the type of pixels that can be read back from the GPU&lt;/li&gt;
&lt;li&gt;Read the data data back from the GPU into a &lt;code&gt;Float32Array&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The codepen above has some additional functionality: It uses a two-pass filter with an intermediate texture, which ends up being 

&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;O(M)O(M)&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;O&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathdefault"&gt;M&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 (where 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;MM&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;M&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 is the kernel width) for each filtered pixel, rather than the conventional single-pass 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;O(M2)O(M^2)&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;O&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;M&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
.&lt;/p&gt;

&lt;p&gt;The approach described above is able to produce contours on the frontend without requiring any additional data, and is able to scale to very large server-side images, as the contouring is only performed on the cropped and down-sampled data already available on the frontend. However, there are two major drawbacks with this approach:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The contours are generated from cropped and down-sampled data, meaning that the data has already been reduced using a block average before the contouring step, unless the user zooms in on the image until reaching full resolution. As shown below, block averaging tends to smooth out interesting features, meaning the contours end up being misleading, rather than an accurate representation of the data.&lt;/li&gt;
&lt;li&gt;The contours are generated from existing data on the frontend, meaning that they can only be generated for images that have already been streamed to the frontend. If one wants to overlay the contours from image &lt;em&gt;A&lt;/em&gt; on top of image &lt;em&gt;B&lt;/em&gt;, the image data for both &lt;em&gt;A&lt;/em&gt; and &lt;em&gt;B&lt;/em&gt; will need to be streamed to the frontend. Users often need to overlay contours from a number of images onto a single raster image, so this approach would require significant network bandwdith.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In the end, issue (1) was deemed to be too much of a drawback, as user rely on CARTA for accurate representations of their data. Instead, we opted for a hybrid contouring approach: Contours poly-lines are generated on the backend over a number of threads, before being compressed (covered in Part 2) and sent to the frontend. The frontend then decompresses the contours and renders them using WebGL (covered in Part 3). An outline of the contour generation pipeline is shown in Figure 1.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uFBE65Ch--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/776qeamayky2ebw389fk.png" alt="Outline Backend"&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;&lt;a&gt;&lt;/a&gt;Figure 1:&lt;/strong&gt; &lt;em&gt;Outline of contour generation and compression on backend&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Smoothing &lt;a&gt;&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Astronomy images are often rather noisy, and contours generated raw images are often full of noise. An example of this is shown in Figure 2. &lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BJXVRfk_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/83100dmixfy99orxptv2.png" alt="M51 contours"&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;&lt;a&gt;&lt;/a&gt;Figure 2:&lt;/strong&gt; &lt;em&gt;M51 contours with no smoothing&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The contouring algorithm is approximately 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;O(W×H×L)O(W \times H \times L)&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;O&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathdefault"&gt;W&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;×&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;H&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;×&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;L&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
, where 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;WW&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;W&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 and 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;HH&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;H&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 are the width and height of the image respectively, and 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;LL&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;L&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 is the number of contour levels to generate. In practise, performance depends heavily on the fraction of points near the contour levels. Poorly chosen contour levels near the noise threshold take a lot longer to generate, because there are so many points in the image at which pixels cross the level. &lt;/p&gt;

&lt;p&gt;Smoothing the image prior to contouring results in a lowered noise level, and contours can be generated to highlight real sources, rather than noise. Two smoothing modes are commonly used in this step: Block-averaging (each 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;N2N^2&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;N&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 block of pixels is replaced by a value represting the entire block, usually the mean) and Gaussian smoothing (the image is convoluted with a 2D Gaussian kernel of size 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;M2M^2&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;M&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
). Block-averaging reduces the size of the image to 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;W×H/N2W \times H / N^2&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;W&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;×&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;H&lt;/span&gt;&lt;span class="mord"&gt;/&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;N&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
, while Gaussian smoothing retains the image size. As such, contours generated from block-averaged images tend to lack detail, as shown in Figure 3. &lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A2ON4S0O--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/qujqbj3ehd1znhvvu5fj.png" alt="M51 contours"&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;&lt;a&gt;&lt;/a&gt;Figure 3:&lt;/strong&gt; &lt;em&gt;M51 contours with block smoothing&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The reduction in image size means the contouring is now 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;O(W×H×L/N2)O(W \times H \times L / N^2)&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;O&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathdefault"&gt;W&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;×&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;H&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;×&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;L&lt;/span&gt;&lt;span class="mord"&gt;/&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;N&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
, while contouring of Gaussian smoothed images remains 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;O(W×H×L)O(W \times H \times L)&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;O&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathdefault"&gt;W&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;×&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;H&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;×&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;L&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. In practice, both smoothing approaches also improve performance by lowering the noise level. Gaussian smoothing tends to produce much clearer contours, as shown in Figure 4. &lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pDpWARLk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/uktx351t8z9fl7et2hb3.png" alt="M51 contours"&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;&lt;a&gt;&lt;/a&gt;Figure 4:&lt;/strong&gt; &lt;em&gt;M51 contours with Gaussian smoothing&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;DS9 defaults to block smoothing, as Gaussian smoothing can be computationally expensive if not carefully optimised. For CARTA, we wanted to make Gaussian smoothing the default smoothing, as we felt it would produce better contours, and we were confident that we could improve smoothing performance over DS9. &lt;/p&gt;

&lt;p&gt;In order to do this, we chose to use a two-pass smoothing process, as Gaussian smoothing is separable, as long as the Gaussian kernel is symmetric. The image is first smoothed by a 1D kernel in the horizontal direction, and then smoothed in the vertical direction. This reduces the complexity per pixel from 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;O(M2)O(M^2)&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;O&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;M&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 to 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;O(M)O(M)&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathdefault"&gt;O&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathdefault"&gt;M&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
. Normally, a two-pass smoothing process requires twice as much memory for the intermediary image, but we use a fixed buffer size of 200 MB and perform the smoothing on strips of the original image. Each strip is processed in parallel, with each thread smoothing a sub-strip. &lt;a href="https://en.wikipedia.org/wiki/Advanced_Vector_Extensions"&gt;AVX&lt;/a&gt; vector instructions (with a fallback to SSE) allow each thread to process four pixels at once. &lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_yyfaLIV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/qo0qcptzy6mprswfttyl.png" alt="Gaussian blur speed"&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;&lt;a&gt;&lt;/a&gt;Figure 5:&lt;/strong&gt; &lt;em&gt;Gaussian blur speed in CARTA 1.3&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A 12-core &lt;a href="https://www.amd.com/en/products/cpu/amd-ryzen-9-3900x#product-specs"&gt;AMD Ryzen 3900x&lt;/a&gt; was used for benchmarking, with a 400 megapixel image and a kernel width of 7 pixels. As shown in Figure 5, performance scales almost linearly with the number of threads, up to about 8 or 9 threads, after which, performance plateaus at about 1500 MPix/s, as the bottleneck shifts to memory bandwidth. A quad-channel processor such as the &lt;a href="https://www.amd.com/en/products/cpu/amd-ryzen-threadripper-3960x#product-specs"&gt;Ryzen 3960x&lt;/a&gt; would allow for near-linear scaling up to a much higher number of threads. &lt;/p&gt;

&lt;p&gt;Comparing performance to that of DS9 with identical smoothing settings, CARTA 1.3's smoothing is up to 1000x faster 🚀🚀🚀&lt;/p&gt;

&lt;p&gt;In the rest of this series, I'll discuss how the contour poly-lines are efficiently compressed and finally rendered in the browser.&lt;/p&gt;

</description>
      <category>cpp</category>
      <category>parallel</category>
      <category>compression</category>
      <category>performance</category>
    </item>
  </channel>
</rss>
