<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Thomas Lemberger</title>
    <description>The latest articles on DEV Community by Thomas Lemberger (@lembergerth).</description>
    <link>https://dev.to/lembergerth</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F175838%2F80c899d4-9027-4160-a4ae-1e1929d65744.jpeg</url>
      <title>DEV Community: Thomas Lemberger</title>
      <link>https://dev.to/lembergerth</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/lembergerth"/>
    <language>en</language>
    <item>
      <title>Convey more data in your plots with range frames</title>
      <dc:creator>Thomas Lemberger</dc:creator>
      <pubDate>Mon, 03 Jul 2023 08:03:41 +0000</pubDate>
      <link>https://dev.to/lembergerth/convey-more-data-in-your-plots-with-range-frames-405h</link>
      <guid>https://dev.to/lembergerth/convey-more-data-in-your-plots-with-range-frames-405h</guid>
      <description>&lt;p&gt;Journalists, data scientists, researchers, software developers: We all have to work with data. And we all rely on the default layout of data plots from plotting libraries or programs like Microsoft Excel. Unfortunately, this layout can be vastly improved.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Conventional Data Plot
&lt;/h2&gt;

&lt;p&gt;A data plot consists of the following components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A frame. The frame holds the x-axis and y-axis and defines the borders of the plot.&lt;/li&gt;
&lt;li&gt;Labels. Labels describe the plot. A plot usually has at least the following labels: a description of the x-axis and a description of the y-axis; tick labels that describe the value of certain locations on the x- and y-axis; graph labels that describe the data set plotted (also known as the legend).&lt;/li&gt;
&lt;li&gt;Data. Data can be plotted in different ways, for example as lines, points or bars.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Of these, the default frame of a plot is often poorly designed and can be significantly improved. Plotting libraries and office suites usually create a simple, rectangular frame that holds your data:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--HDbxyMG4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bwqufoxkosmx35rb5dev.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HDbxyMG4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bwqufoxkosmx35rb5dev.png" alt="A default matplotlib scatter plot" width="511" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This type of frame holds little useful information. Concrete issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The upper and right borders contain no information, so they are unnecessary ink that distracts from the actual data.&lt;/li&gt;
&lt;li&gt;The frame does not show the start- and end-values of each data dimension, only default, well-rounded milestones like 0 and 50. The data, however, actually goes below 0 in both dimensions.&lt;/li&gt;
&lt;li&gt;The frame has arbitrary proportions that may skew the data. The distance between the values 0 and 20 looks much larger on the x-axis than on the y-axis, because the plot does not use the same scale for the two axes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The range frame improves on these issues with easy-to-implement solutions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Range Frame
&lt;/h2&gt;

&lt;p&gt;Edward R. Tufte proposed range frames in his book “The Visual Display of Quantitative Information”. Range frames provide multiple benefits compared to the traditional four-sided frame, and pose no disadvantage.&lt;/p&gt;

&lt;p&gt;The basic idea:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Avoid chart junk (ink that holds no information).&lt;/li&gt;
&lt;li&gt;Let the frame tell the reader the start- and the end-values of the data.&lt;/li&gt;
&lt;li&gt;Put the data in the right proportion.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Avoid chart junk.&lt;/strong&gt; We get rid of the upper and right borders. This already makes the plot easier to look at.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RleUA0nR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/u066c9fkozw9zznrysf5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RleUA0nR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/u066c9fkozw9zznrysf5.png" alt="The default matplotlib scatter plot, but the upper and right axes are removed" width="511" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tell the reader the start- and end-values of the data.&lt;/strong&gt; Do not let the axes span the whole plot from one side to the other. Instead, make each axis start at the lowest value and end at the highest value of the corresponding data dimension. Label these values in the plot.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KJRH_k5V--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/eb8ouo2h0ray2tbjje7a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KJRH_k5V--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/eb8ouo2h0ray2tbjje7a.png" alt="The matplotlib scatter plot with a range frame" width="511" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This immediately shows the value range. It shows the reader that the data on the x-axis starts at 0 and goes to 49, and the data on the y-axis starts at -11.2 and goes to 58.6.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Put the data in the right proportion.&lt;/strong&gt; If the data dimensions have the same units (for example when you compare IQ levels or microservice runtimes), the axes should have the same data scale so that the presentation is not skewed:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2pwZ9vqq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0fhivjnwn2h205wntjhu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2pwZ9vqq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0fhivjnwn2h205wntjhu.png" alt="The matplotlib scatter plot with a range frame and the same data scales on x- and y-axis" width="240" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;See how different the data looks now! If the data dimensions have different units, it is the author’s responsibility to find proportions between x- and y-axis that do not create a misleading impression of the data.&lt;/p&gt;
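
&lt;p&gt;The three steps above can be sketched with plain matplotlib calls. This is a minimal sketch, not the API of any particular library: the helper names &lt;code&gt;data_range&lt;/code&gt;, &lt;code&gt;apply_range_frame&lt;/code&gt; and &lt;code&gt;demo&lt;/code&gt;, the output file name and the sample points are assumptions made for illustration. Only the start- and end-values 0, 49, -11.2 and 58.6 are taken from the plot above.&lt;/p&gt;

```python
def data_range(values):
    """Return the lowest and highest value of one data dimension."""
    return min(values), max(values)


def apply_range_frame(ax, xs, ys, equal_scale=False):
    """Turn the frame of a matplotlib Axes into a range frame."""
    x_low, x_high = data_range(xs)
    y_low, y_high = data_range(ys)

    # 1. Avoid chart junk: hide the upper and right borders.
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)

    # 2. Let the remaining axis lines span only the data range ...
    ax.spines["bottom"].set_bounds(x_low, x_high)
    ax.spines["left"].set_bounds(y_low, y_high)
    # ... and label the start- and end-values.
    ax.set_xticks([x_low, x_high])
    ax.set_yticks([y_low, y_high])

    # 3. Same data scale on both axes; only sensible for equal units.
    if equal_scale:
        ax.set_aspect("equal")


def demo(path="range_frame.png"):
    """Draw a scatter plot of hypothetical data with a range frame."""
    # matplotlib is imported here so that data_range() works without it.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    # Hypothetical sample points; only the extremes match the article.
    xs = [0, 10, 20, 35, 49]
    ys = [-11.2, 3.5, 20.1, 41.0, 58.6]
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    apply_range_frame(ax, xs, ys)
    fig.savefig(path)
```

&lt;p&gt;Calling &lt;code&gt;demo()&lt;/code&gt; writes the plot to a PNG file. Pass &lt;code&gt;equal_scale=True&lt;/code&gt; to &lt;code&gt;apply_range_frame&lt;/code&gt; only when both dimensions share the same unit.&lt;/p&gt;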

&lt;h2&gt;
  
  
  Python Library
&lt;/h2&gt;

&lt;p&gt;For matplotlib and Python, I have created the library &lt;a href="https://pypi.org/project/matplotlib-tufte/"&gt;matplotlib-tufte&lt;/a&gt; to turn any default matplotlib plot into a range frame.&lt;/p&gt;

</description>
      <category>matplotlib</category>
      <category>datascience</category>
    </item>
    <item>
      <title>An Introduction to Using Jupyter Notebooks</title>
      <dc:creator>Thomas Lemberger</dc:creator>
      <pubDate>Thu, 07 May 2020 11:19:13 +0000</pubDate>
      <link>https://dev.to/lembergerth/an-introduction-to-using-jupyter-notebooks-4hig</link>
      <guid>https://dev.to/lembergerth/an-introduction-to-using-jupyter-notebooks-4hig</guid>
      <description>&lt;p&gt;&lt;a href="https://jupyter.org/" rel="noopener noreferrer"&gt;Jupyter Notebooks&lt;/a&gt; are a popular way to collaborate and share code, reports and data analyses. Unfortunately, their usage is fundamentally different from both the common website and conventional IDEs, so the first use can be confusing. To solve this, this article gives you a quick overview of the knowledge you need to successfully use an existing Notebook.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who is this for?
&lt;/h2&gt;

&lt;p&gt;This introduction is aimed at people with only a little technical background. I use some very basic Python code in my running example, but all of the concepts explained here are independent of that.&lt;/p&gt;

&lt;p&gt;I was never able to find an introduction to Jupyter Notebooks that jumps right to the usage. Instead, readers first have to plow through different sections on conceptual ideas, setup and creating content. You will definitely need this knowledge once you decide to create and share your own Notebooks, and I can highly recommend the &lt;a href="https://realpython.com/jupyter-notebook-introduction/" rel="noopener noreferrer"&gt;introduction to Jupyter on realpython.com&lt;/a&gt; and the &lt;a href="https://jupyter.org/documentation" rel="noopener noreferrer"&gt;official documentation&lt;/a&gt;. But for now let’s focus on how to use that wonderful Notebook you found online.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running Example
&lt;/h2&gt;

&lt;p&gt;We will use a &lt;a href="https://mybinder.org/v2/gl/lemberger%2Fjupyter-introduction/master?filepath=Usage.ipynb" rel="noopener noreferrer"&gt;small Jupyter Notebook&lt;/a&gt; I’ve created for this introduction. It is shared through &lt;a href="https://mybinder.org/" rel="noopener noreferrer"&gt;binder&lt;/a&gt;. Binder takes Jupyter Notebooks that are stored in a public Git repository and provides a web server and some resources to run them. If you are interested in publishing your own Notebooks with binder, I can recommend &lt;a href="https://medium.com/@leggomymego/lessons-learned-pushing-a-deep-learning-model-to-production-d6c6d198a7d8" rel="noopener noreferrer"&gt;this well-written guide on Medium&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The running example is self-contained and contains some code examples, so you can just use that, but this article will give you more details on usage and behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture of Jupyter Notebooks
&lt;/h2&gt;

&lt;p&gt;A Notebook consists of multiple cells. Each cell can be run individually and in arbitrary order, or you can run all cells one after the other.&lt;/p&gt;

&lt;p&gt;A cell is either text or code: If you run a text cell, the markdown contained in that cell is rendered and displayed. If you run a code cell, the code in that cell is executed on the current state of the Notebook. Any output of that cell is displayed right below the cell.&lt;/p&gt;
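
&lt;p&gt;On disk, a Notebook is a JSON file that stores its cells as a list. The following minimal sketch reads such a structure with Python’s standard library. The notebook content and the helper &lt;code&gt;summarize_cells&lt;/code&gt; are hypothetical, and the fields are trimmed to the minimum needed for illustration.&lt;/p&gt;

```python
import json

# A tiny hypothetical notebook in the nbformat 4 JSON layout.
NOTEBOOK_JSON = """
{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Usage"]},
    {"cell_type": "code", "metadata": {}, "execution_count": 1,
     "outputs": [], "source": ["run_count = 0"]},
    {"cell_type": "code", "metadata": {}, "execution_count": null,
     "outputs": [], "source": ["print(run_count)"]}
  ]
}
"""


def summarize_cells(notebook):
    """Return one (cell type, run counter, source text) tuple per cell."""
    summary = []
    for cell in notebook["cells"]:
        # The source is stored as a string or as a list of lines.
        source = "".join(cell["source"])
        # Only code cells carry a counter; None means "not run yet".
        counter = cell.get("execution_count")
        summary.append((cell["cell_type"], counter, source))
    return summary


cells = summarize_cells(json.loads(NOTEBOOK_JSON))
for cell_type, counter, source in cells:
    print(cell_type, counter, repr(source))
```

&lt;p&gt;The counter stored with each code cell is the number that the Notebook displays in brackets next to a cell after it was run.&lt;/p&gt;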

&lt;h2&gt;
  
  
  First: Running Code
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fg2o93ek33e5pyul6ct5l.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fg2o93ek33e5pyul6ct5l.gif" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can run a selected cell by hitting &lt;code&gt;Shift+Enter&lt;/code&gt; or clicking the Run ▶ button in the toolbar.&lt;/p&gt;

&lt;p&gt;To signal that the cell was run, the Notebook will add a number &lt;code&gt;[1]:&lt;/code&gt; to the left of the cell. This number increases with each run and shows whether and in which order cells were run. In addition, the Notebook selects the next cell. This makes it easy to rapidly execute multiple cells one after another: Just quickly press &lt;code&gt;Shift+Enter&lt;/code&gt; multiple times.&lt;/p&gt;

&lt;h2&gt;
  
  
  Second: Understanding the Notebook’s Global State
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fp0uoqdp56vkdeifn5ob6.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fp0uoqdp56vkdeifn5ob6.gif" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each Notebook has a single state that is shared between all cells. This state is held by the Notebook’s kernel. Whenever you execute a cell, it modifies that state by running functions and setting variable values. Usually, the cells of a Notebook should be executed top-to-bottom, but the position of a cell in the Notebook has no influence on the program state: only the order in which cells are executed does! Since each cell works on the current global state, running the same cell multiple times may produce different results if its code depends on the global state.&lt;/p&gt;

&lt;p&gt;Take the example in the image above: the cell in the example executes the Python code below.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;run_count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;NameError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;run_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the first execution, the variable &lt;code&gt;run_count&lt;/code&gt; does not exist and Python raises a &lt;code&gt;NameError&lt;/code&gt;, so the &lt;code&gt;except&lt;/code&gt; branch sets &lt;code&gt;run_count = 0&lt;/code&gt;. In subsequent executions of the same cell, &lt;code&gt;run_count&lt;/code&gt; exists and the cell increases its value by 1. This small example shows how a single cell can depend on the global state and behave differently across multiple executions. Make sure to remember this when you play around with Notebooks. The &lt;a href="https://mybinder.org/v2/gl/lemberger%2Fjupyter-introduction/master?filepath=Usage.ipynb" rel="noopener noreferrer"&gt;running example&lt;/a&gt; contains one more example that illustrates the importance of execution order.&lt;/p&gt;
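
&lt;p&gt;The effect of execution order can also be sketched outside of Jupyter. The following minimal sketch simulates code cells as strings that are executed on one shared dictionary. This is an assumption made for illustration, not how Jupyter is implemented, and the helper &lt;code&gt;run_cells&lt;/code&gt; and the three cells are hypothetical.&lt;/p&gt;

```python
def run_cells(cells, order):
    """Execute code cells in the given order on one shared state."""
    state = {}  # stands in for the global state of the Notebook
    for index in order:
        exec(cells[index], state)
    return state


cells = [
    "price = 100",         # cell 0
    "price = price * 2",   # cell 1
    "price = price + 10",  # cell 2
]

top_to_bottom = run_cells(cells, [0, 1, 2])["price"]
swapped = run_cells(cells, [0, 2, 1])["price"]
repeated = run_cells(cells, [0, 1, 1, 2])["price"]
print(top_to_bottom, swapped, repeated)  # prints 210 220 410
```

&lt;p&gt;The same three cells produce three different results, depending only on the order and the number of executions.&lt;/p&gt;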

&lt;p&gt;To wrap this up, it is very important you understand two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;em&gt;The order of cell execution is important when you start to experiment with Notebooks.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;A single cell may be executed multiple times and will always work on the current global state.&lt;/em&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each time a code cell is run, the Notebook puts an increasing number in brackets to the left of the cell, for example &lt;code&gt;[4]:&lt;/code&gt;. This number shows you the order in which the cells were run and makes it easy to check whether everything was run in the intended order.&lt;/p&gt;

&lt;h2&gt;
  
  
  Third: Modification
&lt;/h2&gt;

&lt;p&gt;All cells in a Jupyter Notebook can be modified, and you can add new cells by pressing &lt;code&gt;Alt+Enter&lt;/code&gt;, which runs the current cell and inserts a new one below it. So go ahead, just change some code and run it! Notebooks are all about exploring and experimenting.&lt;/p&gt;

&lt;h3&gt;
  
  
  Saving Changes
&lt;/h3&gt;

&lt;p&gt;One last word of advice: When you use a service like binder, you use a temporary instance of a Jupyter Notebook. This is a good thing, because you can change whatever you want and it will have no effect on the original Notebook. Conversely, your changes will not be saved on the web, because your instance is deleted after some inactivity. To save your changes, you have to download the Notebook’s content through the menu: &lt;code&gt;File&lt;/code&gt;→&lt;code&gt;Download as&lt;/code&gt;. This gives you a selection of different formats: you can download the notebook as-is (&lt;code&gt;Notebook&lt;/code&gt;); you can download the content as a plain Python file (&lt;code&gt;Python&lt;/code&gt;); or you can download the content as text in different formats (e.g., &lt;code&gt;AsciiDoc&lt;/code&gt;, &lt;code&gt;HTML&lt;/code&gt;, or &lt;code&gt;LaTeX&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Disclaimer: “Jupyter” and the Jupyter logos are trademarks or registered trademarks of NumFOCUS.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>jupyter</category>
      <category>collaboration</category>
      <category>python</category>
    </item>
  </channel>
</rss>
