<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dmitry Petrov</title>
    <description>The latest articles on DEV Community by Dmitry Petrov (@fullstackml).</description>
    <link>https://dev.to/fullstackml</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F173376%2F2ef2c12f-56e2-42d0-a02e-9233f9ee5083.png</url>
      <title>DEV Community: Dmitry Petrov</title>
      <link>https://dev.to/fullstackml</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/fullstackml"/>
    <language>en</language>
    <item>
      <title>Review our open-source project: Continuous Machine Learning ⭐ CML ⭐</title>
      <dc:creator>Dmitry Petrov</dc:creator>
      <pubDate>Sat, 11 Jul 2020 07:35:03 +0000</pubDate>
      <link>https://dev.to/fullstackml/review-our-open-source-project-continuous-machine-learning-cml-36pd</link>
      <guid>https://dev.to/fullstackml/review-our-open-source-project-continuous-machine-learning-cml-36pd</guid>
      <description>&lt;p&gt;I've been working on the CML project for the last few months. The idea behind the project is to &lt;strong&gt;automate machine learning&lt;/strong&gt; projects using CI/CD practices:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;📊 Visual reports in GitHub Pull Requests or GitLab Merge Requests.&lt;/li&gt;
&lt;li&gt;💾 Transfer datasets to your CI runners for ML training.&lt;/li&gt;
&lt;li&gt;☁️ Auto-allocation of cloud CPU/GPU. AWS, Azure, GCP, Ali are supported.&lt;/li&gt;
&lt;/ol&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vJ70wriM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/iterative"&gt;
        iterative
      &lt;/a&gt; / &lt;a href="https://github.com/iterative/cml"&gt;
        cml
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      ♾️ CML - Continuous Machine Learning or CI/CD for ML
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;p&gt;
  &lt;a rel="noopener noreferrer" href="https://raw.githubusercontent.com/iterative/cml/master/imgs/title_strip_trim.png"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GIdhSlRT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://raw.githubusercontent.com/iterative/cml/master/imgs/title_strip_trim.png" width="400"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is CML?&lt;/strong&gt; Continuous Machine Learning (CML) is an open-source library for implementing continuous integration &amp;amp; delivery (CI/CD) in
machine learning projects. Use it to automate parts of your development workflow, including
model training and evaluation, comparing ML experiments across your project history, and
monitoring changing datasets.&lt;/p&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://raw.githubusercontent.com/iterative/cml/master/imgs/github_cloud_case_lessshadow.png"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lrDT0pGE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://raw.githubusercontent.com/iterative/cml/master/imgs/github_cloud_case_lessshadow.png" alt=""&gt;&lt;/a&gt; &lt;em&gt;On every pull request, CML helps you automatically train and evaluate models, then generates a visual report with results and metrics. Above, an example report for a &lt;a href="https://github.com/iterative/cml_cloud_case"&gt;neural style transfer model&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;We built CML with these principles in mind:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://nvie.com/posts/a-successful-git-branching-model/" rel="nofollow"&gt;GitFlow&lt;/a&gt; for data science.&lt;/strong&gt; Use GitLab or GitHub to manage ML experiments, track who trained ML models or modified data and when. Codify data and models with &lt;a href="https://raw.githubusercontent.com/iterative/cml/master/#using-cml-with-dvc"&gt;DVC&lt;/a&gt; instead of pushing to a Git repo.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto reports for ML experiments.&lt;/strong&gt; Auto-generate reports with metrics and plots in each Git Pull Request. Rigorous engineering practices help your team make informed, data-driven decisions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No additional&lt;/strong&gt;…&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/iterative/cml"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;Today CML supports two CI/CD systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/features/actions"&gt;GitHub Actions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.gitlab.com/ee/ci/"&gt;GitLab CI/CD&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Automated visual ML report
&lt;/h3&gt;

&lt;p&gt;You can set up auto-generated reports in your GitHub Pull Requests (or GitLab Merge Requests):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--HTL2dYtL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dvc.org/static/4abfb3a481ef05b3f8a0140ede1bda90/c5adc/cml-report-metrics.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HTL2dYtL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dvc.org/static/4abfb3a481ef05b3f8a0140ede1bda90/c5adc/cml-report-metrics.png" alt="GitHub PR Comment"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The report is generated by CML commands (those with the &lt;code&gt;cml-&lt;/code&gt; prefix) called from a GitHub Actions workflow (or a GitLab CI/CD script). A GitHub Actions example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a file `.github/workflows/cml.yaml`&lt;/span&gt;
name: train-my-model

on: &lt;span class="o"&gt;[&lt;/span&gt;push]

&lt;span class="nb"&gt;jobs&lt;/span&gt;:
  run:
    runs-on: &lt;span class="o"&gt;[&lt;/span&gt;ubuntu-latest]
    container: docker://dvcorg/cml-py3:latest

    steps:
      - uses: actions/checkout@v2
      - name: cml_run
        &lt;span class="nb"&gt;env&lt;/span&gt;:
          repo_token: &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="p"&gt;{ secrets.GITHUB_TOKEN &lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;

        run: |
          pip3 &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
          python train.py

          &lt;span class="nb"&gt;cat &lt;/span&gt;metrics.txt &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; report.md
          cml-publish confusion_matrix.png &lt;span class="nt"&gt;--md&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; report.md
          cml-send-comment report.md
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;After you push your code changes to GitHub, the workflow runs and posts the report as a &lt;strong&gt;comment in the Pull Request&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;vi train.py
&lt;span class="nv"&gt;$ &lt;/span&gt;git add train.py
&lt;span class="nv"&gt;$ &lt;/span&gt;git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s1"&gt;'Increase depth to 7'&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;git push
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
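
&lt;p&gt;For GitLab CI/CD, a roughly equivalent pipeline could look like the sketch below. It assumes the same &lt;code&gt;dvcorg/cml-py3&lt;/code&gt; image and &lt;code&gt;cml-&lt;/code&gt; commands as the GitHub workflow above, and that a &lt;code&gt;repo_token&lt;/code&gt; CI/CD variable holding a GitLab access token is defined in the project settings. The stage and job names are placeholders:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Create a file `.gitlab-ci.yml` (sketch; stage and job names are hypothetical)
stages:
  - cml_run

cml:
  stage: cml_run
  # Same image as in the GitHub Actions workflow
  image: dvcorg/cml-py3:latest

  script:
    # Train the model; train.py is expected to write
    # metrics.txt and confusion_matrix.png
    - pip3 install -r requirements.txt
    - python train.py

    # Build the report and post it as a comment
    - cat metrics.txt &amp;gt;&amp;gt; report.md
    - cml-publish confusion_matrix.png --md &amp;gt;&amp;gt; report.md
    - cml-send-comment report.md
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;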



&lt;h3&gt;
  
  
  Auto-allocate GPU and transfer datasets
&lt;/h3&gt;

&lt;p&gt;You can find GPU and data transfer examples on the website: &lt;a href="http://cml.dev/"&gt;http://cml.dev/&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Technical details
&lt;/h3&gt;

&lt;p&gt;The code is written in JavaScript: &lt;a href="https://www.npmjs.com/package/@dvcorg/cml"&gt;https://www.npmjs.com/package/@dvcorg/cml&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is packaged into a Docker image, which is the one used in the workflow above: &lt;a href="https://hub.docker.com/repository/docker/dvcorg/cml-py3"&gt;https://hub.docker.com/repository/docker/dvcorg/cml-py3&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;I'd love to hear your feedback, and to learn what you'd like to automate next in your ML projects.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>githunt</category>
    </item>
  </channel>
</rss>
