<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Cristobal Silva</title>
    <description>The latest articles on DEV Community by Cristobal Silva (@canas).</description>
    <link>https://dev.to/canas</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F234514%2F32feb4df-0d1d-4977-bcd4-cb9e905c2677.png</url>
      <title>DEV Community: Cristobal Silva</title>
      <link>https://dev.to/canas</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/canas"/>
    <language>en</language>
    <item>
      <title>Get Jupyter Notebook diff with Github Actions</title>
      <dc:creator>Cristobal Silva</dc:creator>
      <pubDate>Fri, 18 Sep 2020 02:12:44 +0000</pubDate>
      <link>https://dev.to/canas/get-jupyter-notebook-diff-with-github-actions-4b90</link>
      <guid>https://dev.to/canas/get-jupyter-notebook-diff-with-github-actions-4b90</guid>
      <description>&lt;p&gt;Hello everyone! I'm Canas and this is my first post on DEV. Hopefully it will be useful for people working in Data Science and Machine Learning :)&lt;/p&gt;




&lt;p&gt;One known challenge of working with notebooks is that version control handles them poorly, and few tools address the problem. This motivated me to lessen the burden for engineers who work with these files by using GitHub Actions together with an existing open source tool called &lt;a href="https://github.com/jupyter/nbdime"&gt;nbdime&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  My Workflow
&lt;/h3&gt;

&lt;p&gt;The general idea is to have the GitHub Action post a comment on the PR showing the changes to any notebook relative to the target branch.&lt;/p&gt;

&lt;p&gt;We will use the following existing actions to accomplish this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;actions/checkout@v2&lt;/code&gt;, for fetching the code&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;actions/setup-python@v1&lt;/code&gt;, for installing Python&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;peter-evans/create-or-update-comment@v1&lt;/code&gt;, to create a comment on the PR with &lt;code&gt;nbdiff&lt;/code&gt;'s output.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Submission Category:
&lt;/h3&gt;

&lt;p&gt;This falls under &lt;strong&gt;Maintainer Must-Haves&lt;/strong&gt;, since it provides much better context when notebooks are submitted to a shared repository (e.g., for research or for persisting experiments).&lt;/p&gt;

&lt;h3&gt;
  
  
  Yaml File or Link to Code
&lt;/h3&gt;

&lt;p&gt;You can see a working implementation in &lt;a href="https://github.com/Canas/test-nbdiff-github-action/tree/change"&gt;this repository&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Canas/test-nbdiff-github-action/blob/change/.github/workflows/nbdiff.yaml"&gt;YAML file&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Canas/test-nbdiff-github-action/pull/3"&gt;PR with GH Action comment&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Generate notebook diff&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pull_request"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;check-diff&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v2&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;fetch-depth&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Fetch target branch&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;git fetch origin ${{ github.event.pull_request.base.ref }}:${{ github.event.pull_request.base.ref }}&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Setup Python&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-python@v1&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;python-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3.6"&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install requirements&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pip3 install nbdime&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run and store diff&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;nbdiff ${{ github.event.pull_request.base.ref }} --no-color &amp;gt; diff.log&lt;/span&gt;
          &lt;span class="s"&gt;sed -i '1s/^/\`\`\`diff\n&amp;amp;/' diff.log&lt;/span&gt;
          &lt;span class="s"&gt;sed -i '$s/$/\n&amp;amp;\`\`\`/' diff.log&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Get comment body&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;get-comment-body&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;body=$(cat diff.log)&lt;/span&gt;
          &lt;span class="s"&gt;body="${body//'%'/'%25'}"&lt;/span&gt;
          &lt;span class="s"&gt;body="${body//$'\n'/'%0A'}"&lt;/span&gt;
          &lt;span class="s"&gt;body="${body//$'\r'/'%0D'}"&lt;/span&gt;
          &lt;span class="s"&gt;echo ::set-output name=body::$body&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Create comment&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;peter-evans/create-or-update-comment@v1&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;issue-number&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ github.event.pull_request.number }}&lt;/span&gt;
          &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ steps.get-comment-body.outputs.body }}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;In simple terms, we use &lt;code&gt;nbdiff&lt;/code&gt; to generate a file called &lt;code&gt;diff.log&lt;/code&gt;. We then use &lt;code&gt;sed&lt;/code&gt; to prepend and append the Markdown code fence markers. In the next step, we read &lt;code&gt;diff.log&lt;/code&gt; and escape the &lt;code&gt;%&lt;/code&gt;, newline, and carriage-return characters so that the multiline output &lt;a href="https://github.community/t/set-output-truncates-multiline-strings/16852/3"&gt;is not truncated&lt;/a&gt; when it is stored in the step's &lt;code&gt;body&lt;/code&gt; output. Finally, we pass the &lt;code&gt;body&lt;/code&gt; output to the &lt;code&gt;create-or-update-comment&lt;/code&gt; action, which posts the formatted diff as a comment on the PR.&lt;/p&gt;
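
&lt;p&gt;As a rough illustration of the two text transformations in the workflow, here is a minimal Python sketch (it is not part of the workflow itself). The function names are hypothetical, and the sketch assumes the diff text is already in memory; it only approximates what the &lt;code&gt;sed&lt;/code&gt; calls and the bash substitutions do.&lt;/p&gt;

```python
# Markdown code fence marker, built from single backticks so this example
# does not contain a literal fence of its own.
FENCE = "`" * 3


def wrap_in_diff_fence(diff_text):
    """Approximate the two sed calls: add a diff fence before and after the text.

    Trailing newlines are dropped, similar to what bash command
    substitution does when the file is read back with cat.
    """
    return f"{FENCE}diff\n{diff_text.rstrip(chr(10))}\n{FENCE}"


def escape_for_set_output(text):
    """Approximate the bash substitutions used before set-output.

    The percent sign is replaced first so that the escapes added for
    newline and carriage return are not escaped again.
    """
    return (
        text.replace("%", "%25")
        .replace("\n", "%0A")
        .replace("\r", "%0D")
    )


def build_comment_body(diff_text):
    """Hypothetical helper: fence the diff, then flatten it to a single line."""
    return escape_for_set_output(wrap_in_diff_fence(diff_text))


if __name__ == "__main__":
    # Made-up sample text, only loosely shaped like nbdiff output.
    sample = "## modified /cells/0/source:\n-  x = 1\n+  x = 2  # 100% sure\n"
    print(build_comment_body(sample))
```

&lt;p&gt;The percent sign is escaped first so that the &lt;code&gt;%0A&lt;/code&gt; and &lt;code&gt;%0D&lt;/code&gt; sequences introduced afterwards are not escaped a second time, which matches the order of the substitutions in the workflow.&lt;/p&gt;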


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vJ70wriM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Canas"&gt;
        Canas
      &lt;/a&gt; / &lt;a href="https://github.com/Canas/test-nbdiff-github-action"&gt;
        test-nbdiff-github-action
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;p&gt;This repo is to try out a GitHub action that comments on PRs with Jupyter Notebook diffs (via &lt;a href="https://github.com/jupyter/nbdime"&gt;nbdime&lt;/a&gt;) if available. A sample is available in the only open PR.&lt;/p&gt;
&lt;p&gt;nbdiff.yaml&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;
&lt;pre&gt;&lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Generate notebook diff&lt;/span&gt;
&lt;span class="pl-ent"&gt;on&lt;/span&gt;: &lt;span class="pl-s"&gt;["pull_request"]&lt;/span&gt;
&lt;span class="pl-ent"&gt;jobs&lt;/span&gt;
  &lt;span class="pl-ent"&gt;check-diff&lt;/span&gt;
    &lt;span class="pl-ent"&gt;runs-on&lt;/span&gt;: &lt;span class="pl-s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="pl-ent"&gt;steps&lt;/span&gt;:
      - &lt;span class="pl-ent"&gt;uses&lt;/span&gt;: &lt;span class="pl-s"&gt;actions/checkout@v2&lt;/span&gt;
        &lt;span class="pl-ent"&gt;with&lt;/span&gt;:
          &lt;span class="pl-ent"&gt;fetch-depth&lt;/span&gt;: &lt;span class="pl-c1"&gt;0&lt;/span&gt;

      - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Fetch target branch&lt;/span&gt;
        &lt;span class="pl-ent"&gt;run&lt;/span&gt;: &lt;span class="pl-s"&gt;git fetch origin ${{ github.event.pull_request.base.ref }}:${{ github.event.pull_request.base.ref }}&lt;/span&gt;

      - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Setup Python&lt;/span&gt;
        &lt;span class="pl-ent"&gt;uses&lt;/span&gt;: &lt;span class="pl-s"&gt;actions/setup-python@v1&lt;/span&gt;
        &lt;span class="pl-ent"&gt;with&lt;/span&gt;:
          &lt;span class="pl-ent"&gt;python-version&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;3.6&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;

      - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Install requirements&lt;/span&gt;
        &lt;span class="pl-ent"&gt;run&lt;/span&gt;: &lt;span class="pl-s"&gt;pip3 install nbdime&lt;/span&gt;

      - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Run and store diff&lt;/span&gt;
        &lt;span class="pl-ent"&gt;run&lt;/span&gt;: &lt;span class="pl-s"&gt;|&lt;/span&gt;
&lt;span class="pl-s"&gt;          nbdiff ${{ github.event.pull_request.base.ref }} --no-color &amp;gt; diff.log&lt;/span&gt;
&lt;span class="pl-s"&gt;          sed -i '1s/^/```diff\n&amp;amp;/' diff.log&lt;/span&gt;
&lt;span class="pl-s"&gt;          sed -i '$s/$/\n&amp;amp;```/' diff.log&lt;/span&gt;

      - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Get comment body&lt;/span&gt;
        &lt;span class="pl-ent"&gt;id&lt;/span&gt;: &lt;span class="pl-s"&gt;get-comment-body&lt;/span&gt;
        &lt;span class="pl-ent"&gt;run&lt;/span&gt;: &lt;span class="pl-s"&gt;|&lt;/span&gt;
&lt;span class="pl-s"&gt;          body=$(cat diff.log)&lt;/span&gt;
&lt;span class="pl-s"&gt;          body="${body//'%'/'%25'}"&lt;/span&gt;
&lt;span class="pl-s"&gt;          body="${body//$'\n'/'%0A'}"&lt;/span&gt;
&lt;span class="pl-s"&gt;          body="${body//$'\r'/'%0D'}"&lt;/span&gt;
&lt;span class="pl-s"&gt;          echo ::set-output name=body::$body&lt;/span&gt;

      - &lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Create comment&lt;/span&gt;
        &lt;span class="pl-ent"&gt;uses&lt;/span&gt;: &lt;span class="pl-s"&gt;peter-evans/create-or-update-comment@v1&lt;/span&gt;&lt;/pre&gt;…&lt;/div&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Canas/test-nbdiff-github-action"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;*Be sure to check out the &lt;code&gt;change&lt;/code&gt; branch if you want to see the actual file!&lt;/p&gt;

&lt;h3&gt;
  
  
  Additional Resources / Info
&lt;/h3&gt;

&lt;p&gt;Actions used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/actions/checkout"&gt;checkout@v2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/actions/setup-python"&gt;actions/setup-python@v1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/marketplace/actions/create-or-update-comment"&gt;peter-evans/create-or-update-comment@v1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Libraries used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/jupyter/nbdime"&gt;nbdime&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Possible future work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Test and benchmark on large notebooks&lt;/li&gt;
&lt;li&gt;Look for a way to deploy the web version of nbdime, &lt;code&gt;nbdiff-web&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>actionshackathon</category>
      <category>github</category>
      <category>jupyter</category>
      <category>datascience</category>
    </item>
  </channel>
</rss>
