<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Min</title>
    <description>The latest articles on DEV Community by Min (@minchulkim87).</description>
    <link>https://dev.to/minchulkim87</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F304691%2Fcdd644a5-27d7-4fce-bf3f-7235af3ed8fc.png</url>
      <title>DEV Community: Min</title>
      <link>https://dev.to/minchulkim87</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/minchulkim87"/>
    <language>en</language>
    <item>
      <title>Machine Learning Certifications</title>
      <dc:creator>Min</dc:creator>
      <pubDate>Thu, 26 Jan 2023 00:08:19 +0000</pubDate>
      <link>https://dev.to/minchulkim87/machine-learning-certifications-5gnn</link>
      <guid>https://dev.to/minchulkim87/machine-learning-certifications-5gnn</guid>
      <description>&lt;p&gt;I had some time over the end of year / new year to go over some machine learning concepts and cloud technologies. I have used cloud computing here and there - whether it was to spin up a Minecraft server to play with friends, develop a webpage to help with tutoring, or to develop AI applications for work - but I wanted to get an overview of what the three major cloud providers had to offer. I also got to fill in some of the many gaps in my knowledge of the vast field of data science and machine learning. Overall, it was a good learning experience. As a side effect, I now have some new badges.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjm4uzfl2986xs4z192zy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjm4uzfl2986xs4z192zy.png" alt="Machine Learning Certifications" width="800" height="256"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While there were some differences in how the three major cloud providers approach machine learning, their certifications focused on similar ideas: (a) how to take models into production with scalability and reliability, (b) how to tackle issues such as privacy, bias, and explainability, and (c) how and when to leverage existing AI/ML solutions (cloud-specific or through AutoML). In short, MLOps seemed to be a common theme amongst the three.&lt;/p&gt;

&lt;p&gt;MLOps is said to be a combination of machine learning, DevOps, and data engineering, so it can be very challenging to learn. Here are some recommendations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisite Knowledge
&lt;/h2&gt;

&lt;p&gt;These certifications are aimed at professionals who already have some knowledge and experience in machine learning. Maybe you are a software engineer who is increasingly having to incorporate ML solutions. Perhaps you are a statistician dipping your toes in the cloud. Or you are a manager / consultant working in the AI field.&lt;/p&gt;

&lt;p&gt;In any case, if you are new to programming and/or machine learning - you will need to start there.&lt;/p&gt;

&lt;p&gt;🎓 &lt;a href="https://www.edx.org/professional-certificate/harvardx-computer-science-for-web-programming" rel="noopener noreferrer"&gt;&lt;strong&gt;Computer Science for Web Programming Professional Certificate&lt;/strong&gt;&lt;/a&gt; by Harvard through &lt;a href="//edx.org"&gt;edX&lt;/a&gt; is a great introduction to computer science. It touches on Python, JavaScript, application development, databases and SQL, how the internet works, APIs, and a bit about security.&lt;/p&gt;

&lt;p&gt;🎓 &lt;a href="https://www.coursera.org/specializations/machine-learning-introduction" rel="noopener noreferrer"&gt;&lt;strong&gt;Machine Learning Specialization&lt;/strong&gt;&lt;/a&gt; by Stanford through &lt;a href="//coursera.org"&gt;Coursera&lt;/a&gt; is a great introduction to machine learning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Books
&lt;/h2&gt;

&lt;p&gt;📖 &lt;em&gt;An Introduction to Statistical Learning&lt;/em&gt; by James, Witten, Hastie, and Tibshirani. It is never a bad idea to re-read this book for the "classical" machine learning models.&lt;/p&gt;

&lt;p&gt;📖 &lt;em&gt;Deep Learning with Python&lt;/em&gt; by Francois Chollet is another great introductory book, which covers deep learning models as well as working with images and text.&lt;/p&gt;

&lt;p&gt;📖 &lt;em&gt;Machine Learning Engineering&lt;/em&gt; by Andriy Burkov is a good book that touches on many of the MLOps topics that all machine learning engineers should know.&lt;/p&gt;

&lt;p&gt;📖 &lt;em&gt;Fundamentals of Data Engineering&lt;/em&gt; by Reis and Housley provides a very good overview of all things data engineering in a vendor/product-independent way.&lt;/p&gt;

&lt;h2&gt;
  
  
  Courses
&lt;/h2&gt;

&lt;p&gt;🎓 &lt;a href="https://www.udacity.com/course/machine-learning-dev-ops-engineer-nanodegree--nd0821" rel="noopener noreferrer"&gt;&lt;strong&gt;Machine Learning DevOps Engineer Nanodegree&lt;/strong&gt;&lt;/a&gt; by &lt;a href="https://www.udacity.com/" rel="noopener noreferrer"&gt;Udacity&lt;/a&gt; covered a lot of ground (including tools such as MLflow and FastAPI, and concepts such as writing clean code and automated testing). Despite its cost, Udacity is also a great platform. Their nanodegrees take a project-based learning approach. Very hands-on and a very industry-informed curation of projects. You won't pass unless your project meets all criteria - so the nanodegrees can be quite time consuming, but their feedback system is amazing, and you can resubmit your projects as many times as you need to keep improving.&lt;/p&gt;

&lt;p&gt;🎓 &lt;a href="https://cloudacademy.com/" rel="noopener noreferrer"&gt;Cloud Academy&lt;/a&gt; was a great learning platform - especially with the hands-on labs where they provide you temporary access to AWS, Azure, and Google Cloud, so that you can play around with the services you are learning about, rather than just reading about them. They have learning paths (which include video lessons, hands-on labs, quizzes, and practice exams) for the data engineering and machine learning certifications (and plenty more) on all three major cloud platforms.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practice Exams
&lt;/h2&gt;

&lt;p&gt;Avoid using "brain dumps" (where people try to remember exam questions and share them). Not only is it cheating, but there is no quality control - the questions are often worded incorrectly, and the "answers" are wrong almost all of the time. The "best" or "official" practice exams are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Microsoft Azure: &lt;a href="https://www.measureup.com/" rel="noopener noreferrer"&gt;Measure Up&lt;/a&gt;. Amazing quality, basically the same coverage, style, and difficulty as the real thing.&lt;/li&gt;
&lt;li&gt;AWS: &lt;a href="https://tutorialsdojo.com/" rel="noopener noreferrer"&gt;Tutorials Dojo&lt;/a&gt;. Good explanations, and similar level of difficulty as the real thing.&lt;/li&gt;
&lt;li&gt;Google Cloud: This one doesn't have an "official" partner. But the closest thing I could find was &lt;a href="https://www.whizlabs.com/" rel="noopener noreferrer"&gt;Whizlabs&lt;/a&gt;. The quality varies from question to question, but is reasonable overall. I can confirm that these were not brain dumps. And I can confirm that the questions were mostly useful to study.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>machinelearning</category>
      <category>aws</category>
      <category>azure</category>
      <category>googlecloud</category>
    </item>
    <item>
      <title>JavaScript for Good: Unfinished</title>
      <dc:creator>Min</dc:creator>
      <pubDate>Sun, 26 Apr 2020 23:50:13 +0000</pubDate>
      <link>https://dev.to/minchulkim87/javascript-for-good-unfinished-1pe1</link>
      <guid>https://dev.to/minchulkim87/javascript-for-good-unfinished-1pe1</guid>
      <description>&lt;p&gt;2020 is the year that I finally decided to learn web development and I started the Full Stack Web Developer Nanodegree with Udacity. But this wasn't the first time I had tinkered with the web stuff. Doing this nanodegree and the pandemic situation made me think about my previous adventures that involved JavaScript. In a way, I think the takeaway from this post is that programming can be useful even for people who are not developers, but also for developers to think about any past unfinished projects they'd like to revisit.&lt;/p&gt;

&lt;h2&gt;
  
  
  The background
&lt;/h2&gt;

&lt;p&gt;I was deep in my quarter-life crisis and I wanted to do something that mattered. So I quit my PhD in Physics and became a teacher at a disadvantaged school. I am no longer a teacher, nor have I returned to my PhD (perhaps a story for another time), but while teaching I observed the vast gap that exists between regular schools and disadvantaged schools. Online, or computer-based, learning, which I thought to be the future, did not seem like a viable option for many of these schools, which struggled to afford textbooks, let alone functioning computers. Khan Academy, which I had loved and used for tutoring, was not suitable for classrooms like this. Managing students with limited resources, on top of having to keep the students from being distracted by the entire internet, as well as dealing with lost passwords and other unnecessary troubleshooting, was challenging to say the least.&lt;/p&gt;

&lt;h2&gt;
  
  
  The project
&lt;/h2&gt;

&lt;p&gt;Fixing education is not something that I could have ever tackled alone. But a very small part of the problem that I had faced was a solvable one. What I needed was a free, no login, printable, educational resource online. I was a mathematics and science teacher, so (once I had left teaching) I started working on a mathematics worksheet generator (among other unfinished projects).&lt;/p&gt;

&lt;p&gt;The solution was a set of static webpages, built with JavaScript, HTML, and CSS, that would generate new problems every time the page was reloaded. I wanted to make it entirely browser-based so that nothing had to be installed, no internet connection was needed, and no login was required. I also had to design it so that when I hit print, the page prints nicely.&lt;/p&gt;

&lt;p&gt;This was back in 2014, and I had not returned to it since, other than to use it to print off some worksheets from time to time as I tutored. But I decided to host it on Firebase so that I could simply give the link to my tutoring students to access. Here is the webpage as it was developed back in 2014, with the navigation page added using Bulma.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ac-qbank.web.app/"&gt;Mathematics Test Generator&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Some of the programming was a little challenging, as I developed everything from scratch, including the drawing of the graphs and the shapes - in a way that was both randomly generated and randomly oriented. The questions also had to be random, but in such a way that the answers to the questions would be calculable without calculators. The answers needed to be simplified (with surds) and factorised. It was fun to work on.&lt;/p&gt;
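
&lt;p&gt;To illustrate the "random, but calculable without a calculator" idea, here is a minimal sketch in Python (the original generator was written in JavaScript, and this is not its code). It assumes one possible approach: pick the integer roots first and expand the brackets, so every generated quadratic is guaranteed to factorise. The names format_term and make_quadratic are hypothetical.&lt;/p&gt;

```python
import random


def format_term(coefficient, symbol):
    """Render a signed term such as ' + 3x' or ' - 10'; zero terms vanish."""
    if coefficient == 0:
        return ""
    sign = "+" if coefficient == abs(coefficient) else "-"
    # Write 'x' rather than '1x', but keep a bare constant such as '1'.
    magnitude = "" if symbol and abs(coefficient) == 1 else str(abs(coefficient))
    return f" {sign} {magnitude}{symbol}"


def make_quadratic(rng):
    """Pick two non-zero integer roots, then expand (x - r1)(x - r2)."""
    candidates = [n for n in range(-9, 10) if n != 0]
    r1, r2 = rng.choice(candidates), rng.choice(candidates)
    b, c = -(r1 + r2), r1 * r2
    question = "Factorise: x^2" + format_term(b, "x") + format_term(c, "")
    answer = f"(x{format_term(-r1, '')})(x{format_term(-r2, '')})"
    return {"question": question, "answer": answer, "b": b, "c": c, "roots": sorted([r1, r2])}


rng = random.Random(2014)  # fixed seed (an arbitrary choice) so a run is repeatable
for _ in range(3):
    problem = make_quadratic(rng)
    print(problem["question"], "|", problem["answer"])
```

&lt;p&gt;Working backwards from the answer is the design choice here: the coefficients stay small and the factorised form is known before the question is even written.&lt;/p&gt;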

&lt;h2&gt;
  
  
  The issues
&lt;/h2&gt;

&lt;p&gt;So why did I discontinue my work on it? I was unemployed. This, and a few other projects, all came from the desire to give something to the community. Close to three years of unemployment was hard. Being a PhD dropout with no work experience to show for the previous three years meant that I could not get a job, even after over 100 applications. Once I had my first very minimum-pay job, I was overworking to compensate for my late start.&lt;/p&gt;

&lt;p&gt;Lack of money, lack of time, and ultimately a lack in my belief that this project was worthy of pushing further, were my excuses for not finishing this project.&lt;/p&gt;

&lt;p&gt;And I can't realistically see myself returning to it. Not only because I am a much busier person, but also because of my lack of development skills back in 2014. My code was spaghetti - it was crude, unorganised, messy... disgusting to look at. I'd have to start from scratch.&lt;/p&gt;

&lt;h2&gt;
  
  
  So what
&lt;/h2&gt;

&lt;p&gt;With a huge number of students studying from home and online, given the COVID-19 situation, I wondered how things might have turned out had I continued to work on some sort of educational website for the past 5 years. Maybe, just maybe, I could have done something useful in my 20s.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do you have an unfinished project? Why did you abandon it? Would you ever return to it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you are interested in getting started in web development (which is also a good introduction to programming in general), then this other post might be useful to you.&lt;/p&gt;


&lt;div class="ltag__link"&gt;
  &lt;a href="/minchulkim87" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--R8UcDO6M--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://res.cloudinary.com/practicaldev/image/fetch/s--8Qm3jBxu--/c_fill%2Cf_auto%2Cfl_progressive%2Ch_150%2Cq_auto%2Cw_150/https://dev-to-uploads.s3.amazonaws.com/uploads/user/profile_image/304691/cdd644a5-27d7-4fce-bf3f-7235af3ed8fc.png" alt="minchulkim87"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="/minchulkim87/learn-web-development-for-free-56of" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Learn Web Development for Free&lt;/h2&gt;
      &lt;h3&gt;Min ・ Apr 9 '20 ・ 8 min read&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#javascript&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#python&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#webdev&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#sql&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>watercooler</category>
      <category>javascript</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Learn Web Development for Free</title>
      <dc:creator>Min</dc:creator>
      <pubDate>Thu, 09 Apr 2020 07:26:30 +0000</pubDate>
      <link>https://dev.to/minchulkim87/learn-web-development-for-free-56of</link>
      <guid>https://dev.to/minchulkim87/learn-web-development-for-free-56of</guid>
      <description>&lt;h2&gt;
  
  
  Motivation and Background
&lt;/h2&gt;

&lt;p&gt;My brother has been developing an interest in programming recently. Data science, artificial intelligence, web development, ..., you know, all the cool stuff. Eventually, he built up enough motivation to start learning web development and asked me how he might get started. I'm no expert in web development (I'm more of a data science guy), but as a former teacher, I was keen on putting together a beginner-friendly "curriculum" of sorts.&lt;/p&gt;

&lt;p&gt;He had touched HTML and maybe a programming language before. But that was close to two decades ago, and my brother hadn't really mastered it then nor has he done any programming since. So I can treat him as a complete beginner. There were a few things I had to keep in mind:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Time&lt;/strong&gt;: My brother is working full time, and has to take care of his baby. I have to take a minimalistic approach. Harvard's CS50 on edX.org is amazing - they take complete beginners through the basics of computer science and include a web development course using Python. But it does go through a lot using the C programming language before they get to the web stuff. Besides, unless you sign up, you often cannot follow along with what they code in their lectures because they sometimes use CS50-specific tools. I had to mix and match different resources that fit well together.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Convenience&lt;/strong&gt;: My brother is keen to study in between his busy work/life. I want it to be browser-based as long as possible. Installing tools and setting up environments can be tricky for beginners, and his work computer runs Windows while his home computer is a Mac. It is better for beginners to get started on the code straight away, as trying to mess with tools can be daunting and off-putting.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pedagogy&lt;/strong&gt;: I don't want to sound too much like an ex-teacher, but there are progressions that make more sense in terms of learning. Less complex ideas should be introduced before more complex ones. Sounds obvious, but some courses out there are focussed on job-readiness, and therefore try to get to the most popular tools as soon as possible. For example, out of the front-end SPA technologies, Svelte appears to be the easiest choice to get started with, but most courses offer React.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation&lt;/strong&gt;: There is a lot to be said about learning from video and learning from reading. I won't go into it here, but suffice it to say that at some point, all developers need to learn how to read the documentation. But not all documentation is created equal. Following the docs for FastAPI is far easier than following those of Flask, although there are far more tutorials made about Flask. The choices I present needed to consider how clearly something was explained through a mixture of video and documents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Language&lt;/strong&gt;: Obvious necessities are HTML, CSS, JavaScript, and SQL. That, in theory, should be enough, with Node.js and Express.js pretty much covering the backend part of web development. But my brother was initially, and still is, interested in data science as well. So I thought Python would be a good thing to start getting used to now.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Oh, and I wanted to collect completely FREE resources.&lt;/p&gt;

&lt;p&gt;The result of all these considerations was a curriculum that I have named "Web Dev for Bro". If you are in a similar situation, getting started in Web Development without any programming experience, and don't have much time, you may find this useful too. Here it is:&lt;/p&gt;

&lt;h1&gt;
  
  
  1 Introduction to Programming
&lt;/h1&gt;

&lt;h2&gt;
  
  
  1.1 HTML/CSS
&lt;/h2&gt;

&lt;p&gt;Just do the basic tutorials. No need to do all of them.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://www.w3schools.com/html/default.asp"&gt;HTML Tutorial&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.w3schools.com/css/default.asp"&gt;CSS Tutorial&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  1.2 JavaScript
&lt;/h2&gt;

&lt;p&gt;Just do the basic tutorials. No need to do all of them.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://www.w3schools.com/js/default.asp"&gt;JavaScript Tutorial&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  1.3 Python
&lt;/h2&gt;

&lt;p&gt;Just do the basic tutorials. No need to do all of them.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://www.w3schools.com/python/default.asp"&gt;Python Tutorial&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  1.4 SQL
&lt;/h2&gt;

&lt;p&gt;Just do the basic tutorials and the SQL Database tutorials.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://www.w3schools.com/sql/default.asp"&gt;SQL Tutorial&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  2 Introduction to Web Development
&lt;/h1&gt;

&lt;h2&gt;
  
  
  2.1 Concepts
&lt;/h2&gt;

&lt;p&gt;Only need to watch. No need to follow along. Just try to get used to the concepts at this point.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/playlist?list=PLhQjrBD2T382xHP1dYqfF6kRqL7xBTQNJ"&gt;CS50 2019 - Web Track - YouTube&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  2.2 Tools and Setup
&lt;/h2&gt;

&lt;p&gt;These tutorials are more about having the necessary tools to be a JavaScript and Python developer. Follow along with the installation and familiarise yourself with the tools. &lt;strong&gt;You don't need to be fully comfortable with these tools yet&lt;/strong&gt;. Simply installing these tools is enough at this point. All of the subsequent courses will provide some guidance on how to use these tools.&lt;/p&gt;

&lt;h3&gt;
  
  
  VS Code
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=fnPhJHN0jTE"&gt;Visual Studio Code Intro &amp;amp; Setup&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  JavaScript and npm
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=jHDhaSSKmB0"&gt;NPM Crash Course&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Python and pipenv
&lt;/h3&gt;

&lt;p&gt;There are many ways of managing the environment for Python projects. Pipenv should be the "best", but some of the tutorials, later on, will use virtualenv. Both are fine, and those tutorials will show you how to use virtualenv. On your own projects, try to use pipenv.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=6Qmnh5C4Pmo"&gt;Pipenv Crash Course&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Git/GitHub
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=SWYqp7iY_Tc"&gt;Git &amp;amp; GitHub Crash Course For Beginners&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  SQLite and PostgreSQL
&lt;/h3&gt;

&lt;p&gt;There are many databases. Some of the most popular are PostgreSQL, MySQL, and MongoDB. For most cases, PostgreSQL is the best option. All of these databases require a database "server". This can be cumbersome, so developers often use SQLite during development and then switch over to PostgreSQL in production. So it is good to know a bit of both. The SQL language is mostly the same, so there is nothing "new" to learn beyond what was covered in W3Schools, but here are some tutorials on how to work with SQLite and PostgreSQL.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=girsuXz0yA8"&gt;Sqlite 3 Python Tutorial&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=fZQI7nBu32M"&gt;PostgreSQL - Installation &amp;amp; Overview&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
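
&lt;p&gt;To show why SQLite is convenient during development, here is a minimal sketch using the sqlite3 module from the Python standard library. No database server is started; the special name ":memory:" keeps the whole database in RAM. The todos table and its columns are made up for illustration.&lt;/p&gt;

```python
import sqlite3

# ":memory:" creates a throwaway database in RAM; pass a file path instead to persist it.
connection = sqlite3.connect(":memory:")

# Hypothetical table for illustration only.
connection.execute(
    "CREATE TABLE todos (id INTEGER PRIMARY KEY, task TEXT NOT NULL, done INTEGER DEFAULT 0)"
)

# "?" placeholders let the driver handle quoting of the values.
connection.executemany(
    "INSERT INTO todos (task) VALUES (?)",
    [("Learn HTML",), ("Learn SQL",)],
)
connection.execute("UPDATE todos SET done = 1 WHERE task = ?", ("Learn HTML",))

rows = connection.execute("SELECT task, done FROM todos ORDER BY id").fetchall()
print(rows)

connection.close()
```

&lt;p&gt;The SQL statements themselves are the kind covered in the W3Schools tutorial; only the connection step is specific to SQLite.&lt;/p&gt;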

&lt;h2&gt;
  
  
  2.3 Practice
&lt;/h2&gt;

&lt;p&gt;Follow along with this introductory tutorial. There will be a concept that hasn't been introduced before called ORM, which will use a tool called SQLAlchemy. Don't freak out, just follow along as an introduction for now. It will be covered in the future. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://youtu.be/Z1RJmh_OqeA"&gt;Learn Flask for Python - Full Tutorial&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🍎 &lt;strong&gt;By this point, you should have a basic idea of how to make small and simple web apps.&lt;/strong&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  3 Web Development Fundamentals
&lt;/h1&gt;

&lt;h2&gt;
  
  
  3.1 Concepts
&lt;/h2&gt;

&lt;p&gt;Things are about to get more advanced from here on. Watch the following video lectures. There is no need to follow along, but you may need to re-watch some lectures a couple of times to really absorb the concepts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/playlist?list=PLhQjrBD2T382hIW-IsOVuXP1uMzEvmcE5"&gt;CS50's Web Programming with Python and JavaScript - YouTube&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  3.2 Practice
&lt;/h2&gt;

&lt;p&gt;Follow along with the following tutorials to get some practice in backend and frontend web development. Some of these tutorials may be similar, but each has a slightly different focus, so it is well worth going through all of them. Note that you'll be moving towards separating the frontend from the backend. The backend provides the API and the frontend consumes the API.&lt;/p&gt;

&lt;h3&gt;
  
  
  Backend
&lt;/h3&gt;

&lt;p&gt;Follow along with this tutorial to build a full web app using Flask.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://youtu.be/3mwFC4SHY-Y"&gt;Python Flask Tutorial for Beginners - Full Course in 3 hours (2020)&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;While it is possible to build full apps using Flask and its templates, modern apps often separate the backend from the frontend. This means that Flask will only be used to provide an API that the frontend part of the app can use.&lt;/p&gt;
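
&lt;p&gt;To make the "backend provides, frontend consumes" split concrete, here is a rough sketch that uses only the Python standard library instead of Flask, so it runs without installing anything. The TODOS data, the ApiHandler class, and the /todos path are all hypothetical, and the urlopen call simply stands in for what a frontend would do with fetch in the browser.&lt;/p&gt;

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

TODOS = [{"id": 1, "task": "Learn Flask"}]  # hypothetical in-memory data


class ApiHandler(BaseHTTPRequestHandler):
    """A tiny 'backend': every GET request is answered with the todos as JSON."""

    def do_GET(self):
        body = json.dumps(TODOS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default request logging for this demo


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), ApiHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The 'frontend' side: request the JSON and decode it into Python objects.
url = f"http://127.0.0.1:{server.server_port}/todos"
with urlopen(url) as response:
    todos = json.loads(response.read().decode("utf-8"))
print(todos)

server.shutdown()
server.server_close()
```

&lt;p&gt;The two halves only share a URL and a JSON format, which is why the backend can be tested on its own before any frontend exists.&lt;/p&gt;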

&lt;p&gt;One tool that can help you test these APIs as you build the backend, before building the frontend, is called Postman. As with other tool videos, you don't need to remember everything yet, but install Postman and familiarise yourself with it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=Iq7eh6DhN6M"&gt;Postman API Crash Course for Beginners [2020] - Learn Postman in 1 hour&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now follow along with this tutorial to build an API.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://youtu.be/PTZiDnuC86g"&gt;REST API With Flask &amp;amp; SQL Alchemy&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Frontend
&lt;/h3&gt;

&lt;p&gt;There are many frontend frameworks. One of the easiest to get started with is Svelte. Follow along with this tutorial to get a feel for it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://youtu.be/Bfi96LUebXo"&gt;Svelte 3 - Quickstart Tutorial (Getting Started With Svelte.js)&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You will feel like you need to go back and revise JavaScript. This is perfectly normal and there is no harm in going back to basics as you need. But just to show how amazingly simple Svelte will be once you get it, here is a 3 minute tutorial that shows you how to build a todo app in 15 lines of code.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=1zWEAluGy9k"&gt;Intro to Svelte - Todo app in 3 minutes and 15 lines of code&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And here is a longer one that you can follow along with to build a to do app with a bit more complexity. Also, towards the end of this tutorial there is an introduction to how to connect such frontend apps to the backend API.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://youtu.be/0uTX5GfmhTo"&gt;Svelte v3 - Basics - Todo App&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;🔥 &lt;strong&gt;By this point, you should have a pretty good idea of how to make simple web apps.&lt;/strong&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  4 Master Web Development
&lt;/h1&gt;

&lt;p&gt;You are already ready to start building hobby web apps. To become better, you need to become pretty good with at least one frontend framework (you've met Svelte) and at least one backend framework (you've met Flask and Django). In addition, you will need to get comfortable with deploying web apps online (Heroku, for example). Underlying all this is being good at Python and JavaScript.&lt;/p&gt;

&lt;h2&gt;
  
  
  4.1 Learn a Frontend "Framework"
&lt;/h2&gt;

&lt;p&gt;Many choices here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Svelte&lt;/li&gt;
&lt;li&gt;Vue&lt;/li&gt;
&lt;li&gt;Angular&lt;/li&gt;
&lt;li&gt;React&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pick one and master it. &lt;strong&gt;Svelte&lt;/strong&gt; is the most elegant and easiest to learn. It is also the best introduction to frontend technologies. The official documentation has a great tutorial.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://svelte.dev/tutorial/basics"&gt;Svelte&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;To get a job in frontend development, you need to know the popular tools. The next thing I would learn would be &lt;strong&gt;Vue&lt;/strong&gt;. This is optional, and I would try and make a few web apps using Svelte first.&lt;/p&gt;

&lt;h2&gt;
  
  
  4.2 Learn a Backend "Framework"
&lt;/h2&gt;

&lt;p&gt;Many choices here too:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Flask&lt;/li&gt;
&lt;li&gt;Django&lt;/li&gt;
&lt;li&gt;FastAPI&lt;/li&gt;
&lt;li&gt;Express&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As with the frontend case, pick one and master it. &lt;strong&gt;FastAPI&lt;/strong&gt; is the most modern Python framework. The official documentation has a great tutorial. In fact, FastAPI has the best official tutorial that covers a lot of concepts including things like security. However, it does not have many video tutorials on YouTube or anywhere else yet. The framework design is very similar to Flask, so the knowledge should be transferable.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://fastapi.tiangolo.com/tutorial/"&gt;FastAPI&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In the Python world, the most popular frameworks are &lt;strong&gt;Flask&lt;/strong&gt; and &lt;strong&gt;Django&lt;/strong&gt;. But the backend development market is fragmented into Python, JavaScript, Ruby, PHP, Java, Scala, Go, Rust, and many more languages. I would stick with Python and JavaScript. For JavaScript, &lt;strong&gt;Express&lt;/strong&gt; is the most popular backend framework, and it is the next thing I would learn. Again, this is optional, and I would try and make a few web apps using FastAPI first.&lt;/p&gt;

&lt;h2&gt;
  
  
  4.3 Figure out how to deploy web apps
&lt;/h2&gt;

&lt;p&gt;There are many options here and things keep changing all the time. Some starting points are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Docker&lt;/li&gt;
&lt;li&gt;Heroku&lt;/li&gt;
&lt;li&gt;Firebase&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Really, these things are great. Seriously great. There should be plenty of tutorials out there on how to use these tools. Some of the YouTube channels below have a lot of tutorials on these.&lt;/p&gt;

&lt;p&gt;😎 &lt;strong&gt;By this point, you're good to go bro. You've got this.&lt;/strong&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  5 Continue Learning
&lt;/h1&gt;

&lt;h3&gt;
  
  
  YouTube
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Traversy Media&lt;/li&gt;
&lt;li&gt;Academind&lt;/li&gt;
&lt;li&gt;The Net Ninja&lt;/li&gt;
&lt;li&gt;Fireship&lt;/li&gt;
&lt;li&gt;Pretty Printed&lt;/li&gt;
&lt;li&gt;Code Drip&lt;/li&gt;
&lt;li&gt;Web Dev Simplified&lt;/li&gt;
&lt;li&gt;Tech with Tim&lt;/li&gt;
&lt;li&gt;Corey Schafer&lt;/li&gt;
&lt;li&gt;Svelte Master&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Udemy
&lt;/h3&gt;

&lt;p&gt;They do massive discounts fairly often. So when there is a discount, purchase some courses that you are interested in. Svelte, Vue, Flask, Express, Heroku, Docker, Firebase, Authentication, Payment, etc.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>python</category>
      <category>webdev</category>
      <category>sql</category>
    </item>
    <item>
      <title>Location Data for Cities and Towns</title>
      <dc:creator>Min</dc:creator>
      <pubDate>Wed, 19 Feb 2020 20:10:21 +0000</pubDate>
      <link>https://dev.to/minchulkim87/location-data-for-cities-and-towns-ail</link>
      <guid>https://dev.to/minchulkim87/location-data-for-cities-and-towns-ail</guid>
      <description>&lt;h2&gt;
  
  
  Why might you want location data for cities and towns?
&lt;/h2&gt;

&lt;p&gt;Have you ever wanted to geocode some address data, but only to the city/town level - say, for privacy reasons or entity resolution purposes?&lt;/p&gt;

&lt;p&gt;Or maybe you just need to plot out all the major cities around the world on a map?&lt;/p&gt;

&lt;h2&gt;
  
  
  There is no easy way of obtaining such data.
&lt;/h2&gt;

&lt;p&gt;You could try:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Google (or some other commercial) geocoding APIs on your lists of city names and towns, but that will probably cost you money.&lt;/li&gt;
&lt;li&gt;OpenStreetMap (and APIs built on OSM), but this isn't easy, especially if you are trying to geocode data in bulk.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Wikidata is a very good source.
&lt;/h2&gt;

&lt;p&gt;OSM is not the only source of open data. Wikidata also has quite a lot of location data.&lt;/p&gt;

&lt;p&gt;This is how you would obtain the locations of cities and towns around the world.&lt;/p&gt;

&lt;p&gt;Go to &lt;a href="https://query.wikidata.org"&gt;https://query.wikidata.org&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;and use the following SPARQL query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sparql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?citytownLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?countryLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?loc&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nv"&gt;?citytown&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P31&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P279&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;Q7930989&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nv"&gt;?citytown&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P625&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?loc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?citytown&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P17&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?country&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;SERVICE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;label&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nn"&gt;bd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;serviceParam&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;language&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After about a minute, you should get something like:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--o7H0-jmh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/fc6m93ggsyd5892kijbm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--o7H0-jmh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/fc6m93ggsyd5892kijbm.png" alt="table" width="880" height="398"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can simply download the data as CSV or JSON.&lt;/p&gt;
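
&lt;p&gt;If you go the CSV route, the &lt;code&gt;loc&lt;/code&gt; column arrives as text such as &lt;code&gt;Point(13.38 52.52)&lt;/code&gt;, with longitude first and latitude second. Here is a minimal Python sketch for turning that into numbers. It assumes the CSV headers match the query variables; the sample rows and the function names &lt;code&gt;parse_point&lt;/code&gt; and &lt;code&gt;load_cities&lt;/code&gt; are made up for illustration.&lt;/p&gt;

```python
import csv
import io
import re

# Wikidata returns coordinates (P625) as WKT text: "Point(longitude latitude)".
POINT_PATTERN = re.compile(r"Point\(\s*([-+0-9.eE]+)\s+([-+0-9.eE]+)\s*\)")


def parse_point(wkt):
    """Return (latitude, longitude) from a WKT point, or None if it does not match."""
    match = POINT_PATTERN.fullmatch(wkt.strip())
    if match is None:
        return None
    longitude, latitude = (float(value) for value in match.groups())
    return latitude, longitude


def load_cities(csv_text):
    """Read the downloaded CSV text into a list of dicts with numeric coordinates."""
    cities = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        coordinates = parse_point(row["loc"])
        if coordinates is None:
            continue  # skip rows whose location cannot be parsed
        latitude, longitude = coordinates
        cities.append({
            "city": row["citytownLabel"],
            "country": row["countryLabel"],
            "latitude": latitude,
            "longitude": longitude,
        })
    return cities


# Hypothetical sample in the shape of the downloaded file (assumed headers).
SAMPLE_CSV = """citytownLabel,countryLabel,loc
Berlin,Germany,Point(13.383333333 52.516666666)
Wellington,New Zealand,Point(174.777222222 -41.288888888)
"""

if __name__ == "__main__":
    for city in load_cities(SAMPLE_CSV):
        print(city)
```

&lt;p&gt;The one thing to watch is the order: WKT puts longitude before latitude, which is the opposite of what most mapping tools expect.&lt;/p&gt;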

&lt;p&gt;You can also take a look by displaying the data on a map (there should be a dropdown on the left).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tV0FzbjI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/0qaax4j52sjpsk3un0w6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tV0FzbjI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/0qaax4j52sjpsk3un0w6.png" alt="map" width="880" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is interesting to see which communities are more active in contributing open data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--U3_aZa5p--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/bfs1diizgucz955ihoqx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--U3_aZa5p--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/bfs1diizgucz955ihoqx.png" alt="zoom" width="880" height="469"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And if you are interested in contributing location data, or are a school teacher who could turn such "data contribution" into a geography class project, then the following query will give you a list of cities and towns without location data, along with links to their Wikidata pages.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sparql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?citytown&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?citytownLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?countryLabel&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nv"&gt;?citytown&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P31&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P279&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;Q7930989&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nv"&gt;?citytown&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P17&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?country&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;FILTER&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;NOT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;EXISTS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?citytown&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P625&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?loc&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;SERVICE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;label&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nn"&gt;bd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;serviceParam&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;language&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will also notice that the towns without location data tend to be missing a lot of other data too, like their English labels.&lt;/p&gt;

&lt;p&gt;Wikidata and its community of contributors are amazing.&lt;/p&gt;

</description>
      <category>opendata</category>
      <category>sql</category>
      <category>sparql</category>
      <category>wikidata</category>
    </item>
    <item>
      <title>Best Free Data Courses for Beginners</title>
      <dc:creator>Min</dc:creator>
      <pubDate>Wed, 29 Jan 2020 19:24:17 +0000</pubDate>
      <link>https://dev.to/minchulkim87/best-free-data-courses-for-beginners-1n2d</link>
      <guid>https://dev.to/minchulkim87/best-free-data-courses-for-beginners-1n2d</guid>
      <description>&lt;p&gt;&lt;em&gt;You might think this is yet another post about free resources for Python. In one sense, yes it is. But really, it is not.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Make your decision to learn SQL and Python (maybe even some R)
&lt;/h2&gt;

&lt;p&gt;It is the year 2020, and there are still posts on blogs and LinkedIn encouraging people to learn Python. Judging by YouTube videos and blog posts, you might think the days of the Excel Data Analyst are gone. The overall trend may well be real, but we are far from living in a time where all Data Analysts (and BI Analysts) know how to code. Many have progressed to using Power BI or Tableau on top of Excel, but are still reluctant to learn to code.&lt;/p&gt;

&lt;p&gt;Many others have written about why Data Analysts should learn to program/code, and I do not have much to add there. But I will quickly mention three barriers that people face:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I'm not a mathematics/programming/technical person.&lt;/li&gt;
&lt;li&gt;I don't have the time.&lt;/li&gt;
&lt;li&gt;The degrees/courses seem expensive.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As an ex-teacher, I hate the first statement. These skills are not beyond the reach of anybody. It is merely up to your motivation and mindset. And if you are not up for getting "technical" then perhaps get a different job.&lt;/p&gt;

&lt;p&gt;The second excuse is more realistic, and I don't think I can help much, other than to say that 5 hours per week for a couple of years can take you a long way. As professionals, we should always strive to learn and improve.&lt;/p&gt;

&lt;p&gt;Cost. &lt;strong&gt;Now, this is where this post can really help you.&lt;/strong&gt; Whether you are a student wanting to learn data science, a BI Analyst wanting to learn Python, or a corporate person (finance, HR, marketing, etc.) wanting to use analytics in your job, these courses will help you get started for free.&lt;/p&gt;

&lt;h2&gt;
  
  
  Free does not have to be 💩
&lt;/h2&gt;

&lt;p&gt;I have taken many (paid) courses from &lt;a href="https://www.edx.org/"&gt;edX&lt;/a&gt;, &lt;a href="https://www.coursera.org/"&gt;Coursera&lt;/a&gt;, &lt;a href="https://www.udacity.com/"&gt;Udacity&lt;/a&gt;, &lt;a href="https://www.datacamp.com/"&gt;DataCamp&lt;/a&gt;, &lt;a href="https://www.udemy.com/"&gt;Udemy&lt;/a&gt;, and a degree from a &lt;a href="https://london.ac.uk/"&gt;University&lt;/a&gt;. They all have their strengths and weaknesses, styles and emphases. I'll make a separate post about my thoughts on those perhaps, but for now, let me introduce their (surprisingly less known) free counterparts.&lt;/p&gt;

&lt;p&gt;Firstly, while edX and Coursera are great, their "free" version did not turn out to be completely free. Last time I checked, the free auditing came with a time limit, usually a few weeks before you lose the free access. Secondly, university degrees are (to my knowledge) not free, nor is Udemy. Lastly, while Udemy can be cheap when their discounts are on, even at the low cost, I wouldn't recommend Udemy for data-science-related courses - despite most other posts listing Udemy and Coursera as their top two choices.&lt;/p&gt;

&lt;p&gt;That leaves DataCamp and Udacity. What surprised me is that many people did not seem to know that they offer free courses and tutorials. I am also adding the Kaggle courses (again, surprisingly lesser-known) to the list.&lt;/p&gt;

&lt;h1&gt;
  
  
  The free courses
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Udacity
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VIjwwuA2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ye4q34i0m76e36tjfvxf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VIjwwuA2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ye4q34i0m76e36tjfvxf.png" alt="Udacity" width="200" height="237"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you go to &lt;a href="https://www.udacity.com/courses/all"&gt;https://www.udacity.com/courses/all&lt;/a&gt;, you will see the entire list of Udacity's courses. You can then filter to see only the Free Courses. My top picks for beginners are:&lt;/p&gt;

&lt;h3&gt;
  
  
  Programming
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.udacity.com/course/intro-to-html-and-css--ud001"&gt;Intro to HTML and CSS | Udacity&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.udacity.com/course/intro-to-javascript--ud803"&gt;Intro to JavaScript | Udacity&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.udacity.com/course/introduction-to-python--ud1110"&gt;Introduction to Python Programming | Udacity&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Statistics
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.udacity.com/course/intro-to-statistics--st101"&gt;Intro to Statistics | Udacity&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.udacity.com/course/intro-to-descriptive-statistics--ud827"&gt;Intro to Descriptive Statistics | Udacity&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.udacity.com/course/intro-to-inferential-statistics--ud201"&gt;Intro to Inferential Statistics | Udacity&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Analysis
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.udacity.com/course/sql-for-data-analysis--ud198"&gt;SQL for Data Analysis | Udacity&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.udacity.com/course/data-analysis-with-r--ud651"&gt;Data Analysis with R | Udacity&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.udacity.com/course/intro-to-data-analysis--ud170"&gt;Intro to Data Analysis | Udacity&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Wrangling and Machine Learning
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.udacity.com/course/data-wrangling-with-mongodb--ud032"&gt;Data Wrangling with MongoDB | Udacity&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.udacity.com/course/machine-learning--ud262"&gt;Machine Learning | Udacity&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.udacity.com/course/deep-learning-pytorch--ud188"&gt;Intro to Deep Learning with PyTorch | Udacity&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Kaggle
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VnpydbgO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/sky6sflr7v9ggxi6855c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VnpydbgO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/sky6sflr7v9ggxi6855c.png" alt="Kaggle" width="200" height="77"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Kaggle is no longer just the data science competition ground it once was. It not only provides a free online compute environment, it also offers &lt;a href="https://www.kaggle.com/learn/overview"&gt;several courses&lt;/a&gt; for free. The Udacity free courses are mostly video-based, while the Kaggle courses are entirely notebook-based. The Kaggle courses will &lt;strong&gt;complement&lt;/strong&gt; the Udacity courses nicely.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.kaggle.com/learn/python"&gt;Learn Python Tutorials | Kaggle&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.kaggle.com/learn/pandas"&gt;Learn Pandas Tutorials | Kaggle&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.kaggle.com/learn/data-visualization"&gt;Learn Data Visualization Tutorials | Kaggle&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.kaggle.com/learn/intro-to-sql"&gt;Learn Intro to SQL Tutorials | Kaggle&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.kaggle.com/learn/advanced-sql"&gt;Learn Advanced SQL Tutorials | Kaggle&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.kaggle.com/learn/intro-to-machine-learning"&gt;Learn Intro to Machine Learning Tutorials | Kaggle&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.kaggle.com/learn/intermediate-machine-learning"&gt;Learn Intermediate Machine Learning Tutorials | Kaggle&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  DataCamp
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--eAkmIOo4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/boqm953qw5f6ds5bwtfa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eAkmIOo4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/boqm953qw5f6ds5bwtfa.png" alt="DataCamp" width="200" height="200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;DataCamp is very affordable and has a very comprehensive library of courses covering spreadsheets, SQL, Tableau, R, Scala, and Python, and the list is probably growing. But if you want to access their free content &lt;em&gt;beyond just the first parts of their paid courses&lt;/em&gt;, they also have blog-like, community-created &lt;a href="https://www.datacamp.com/community/tutorials/"&gt;tutorial content&lt;/a&gt;. These tutorials are not created by DataCamp, and their quality may vary, but some of the great ones are kindly tagged as "must read". They can be a great &lt;strong&gt;supplement&lt;/strong&gt; to other courses. Here are my picks:&lt;/p&gt;

&lt;h3&gt;
  
  
  Improving your data wrangling game
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.datacamp.com/community/tutorials/sql-tutorial-query"&gt;SQL Tutorial: How To Write Better Queries | DataCamp&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.datacamp.com/community/tutorials/tutorial-postgresql-python"&gt;Using PostgreSQL in Python | DataCamp&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.datacamp.com/community/tutorials/introduction-mongodb-python"&gt;Introduction to MongoDB and Python | DataCamp&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.datacamp.com/community/tutorials/importing-data-into-pandas"&gt;Importing Data into Pandas | DataCamp&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Improving your pandas game
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.datacamp.com/community/tutorials/pandas-tutorial-dataframe-python"&gt;Pandas Tutorial: DataFrames in Python | DataCamp&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.datacamp.com/community/tutorials/pandas-idiomatic"&gt;5 Tips To Write Idiomatic Pandas Code | DataCamp&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.datacamp.com/community/tutorials/exploratory-data-analysis-python"&gt;Python Exploratory Data Analysis Tutorial | DataCamp&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.datacamp.com/community/tutorials/time-series-analysis-tutorial"&gt;Python Time Series Analysis Tutorial | DataCamp&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Practicing machine learning
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.datacamp.com/community/tutorials/essentials-linear-regression-python"&gt;Essentials of Linear Regression in Python | DataCamp&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.datacamp.com/community/tutorials/k-nearest-neighbor-classification-scikit-learn"&gt;KNN Classification using Scikit-learn | DataCamp&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.datacamp.com/community/tutorials/k-means-clustering-python"&gt;K-Means Clustering with scikit-learn | DataCamp&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.datacamp.com/community/tutorials/principal-component-analysis-in-python"&gt;(Tutorial) Principal Component Analysis (PCA) in Python | DataCamp&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;As you can see, there are plenty of free resources available from very reputable training providers. You can go a very long way with these free courses alone, but once you are done testing the waters, I would recommend moving on to paid courses from a combination of education providers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.datacamp.com/"&gt;DataCamp&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.udacity.com/"&gt;Udacity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.edx.org/"&gt;edX&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.coursera.org/"&gt;Coursera&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Bonus: YouTube
&lt;/h2&gt;

&lt;p&gt;Depending on your interests, there is a plethora of resources available on YouTube as well (although you would need to be able to sift through what is good and what is not). Some very popular YouTubers turned out to be plagiarisers, and some others, whilst pretty good, write code in a style that I would not want to collaborate with. Don't get me wrong, there are some that are great; it is just that you would already need a sense of what "better" code looks like to decide which videos are high quality. Having said that, there are really great "theory-explaining" videos on YouTube that cannot be found elsewhere. So my advice would be: don't try to learn SQL, R, and Python using YouTube, but use YouTube to look up theories you want to get a quick understanding of. I've prepared a &lt;a href="https://www.youtube.com/playlist?list=PLbDGs3QXnzJBQIkXMX4ceKdcJlK-fcSnx"&gt;playlist for machine learning&lt;/a&gt; if you are interested.&lt;/p&gt;

&lt;p&gt;Happy learning!&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>python</category>
      <category>sql</category>
      <category>analysis</category>
    </item>
    <item>
      <title>Documenting SQL with Markdown and Diagrams</title>
      <dc:creator>Min</dc:creator>
      <pubDate>Sun, 26 Jan 2020 04:15:34 +0000</pubDate>
      <link>https://dev.to/minchulkim87/documenting-sql-with-markdown-and-diagrams-2lhp</link>
      <guid>https://dev.to/minchulkim87/documenting-sql-with-markdown-and-diagrams-2lhp</guid>
      <description>&lt;h2&gt;
  
  
  Nobody seems to document SQL code
&lt;/h2&gt;

&lt;p&gt;I find that SQL code is often undocumented. A lot of SQL code doesn't get the &lt;em&gt;sweet documentation love&lt;/em&gt; that Python and JavaScript code so often does.&lt;/p&gt;

&lt;p&gt;Some commented-out lines of code, which supposedly apply one particular business rule over another, are left hanging with no explanation of what they were supposed to do. Some comments warn about the consequences of changing particular join conditions, without explaining why.&lt;/p&gt;

&lt;p&gt;Many SQL scripts go over a dozen layers of joins deep, with multiple subqueries, badly written split-apply-combine logic, and inexplicably named aliases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Someday that might bite us on our backsides
&lt;/h2&gt;

&lt;p&gt;Executives of the business may one day get an automatically generated report and ask how the figures were generated (think "why is this number so low?"). With people moving from one job to another, undocumented SQL gibberish is all that the analyst has to go by (&lt;em&gt;if&lt;/em&gt; the analyst is given access to the code at all).&lt;/p&gt;

&lt;p&gt;A lot of data pipelines rely on SQL, and they really should be documented better - explain to the user the code that they are looking at. Throw in a diagram perhaps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Let's use markdown in a comment block
&lt;/h2&gt;

&lt;p&gt;But we shouldn't have to write a separate document that does these things. Separating the documentation from the code increases the chances that the documentation will not be updated. Documentation should really live in the code as comments. I like the simplicity of markdown, and think it would be nice if I could use markdown within the comment blocks of the SQL to document the code.&lt;/p&gt;
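
&lt;p&gt;To make the idea concrete, here is a rough Python sketch that pulls the markdown out of the block comments in a SQL script. It only illustrates the concept and makes no claim about how any particular tool does it; the function name &lt;code&gt;extract_markdown&lt;/code&gt;, the example query, and its table and column names are all hypothetical.&lt;/p&gt;

```python
import re
import textwrap

# Matches /* ... */ block comments, including ones that span several lines.
BLOCK_COMMENT = re.compile(r"/\*(.*?)\*/", re.DOTALL)


def extract_markdown(sql_text):
    """Join the text of every block comment, in order, into one markdown string."""
    blocks = []
    for body in BLOCK_COMMENT.findall(sql_text):
        # Remove common indentation but keep the relative layout of the markdown.
        blocks.append(textwrap.dedent(body).strip())
    return "\n\n".join(blocks)


# Hypothetical query with its documentation written as markdown in a comment.
EXAMPLE_SQL = """/*
# Monthly revenue report

Cancelled orders are excluded because finance reports them separately.
*/
SELECT customer_type, SUM(amount) AS revenue
FROM orders
WHERE status != 'cancelled'
GROUP BY customer_type;
"""

if __name__ == "__main__":
    print(extract_markdown(EXAMPLE_SQL))
```

&lt;p&gt;Because the markdown lives in ordinary SQL comments, the script still runs unchanged in any database client, and the explanation travels with the code.&lt;/p&gt;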

&lt;h2&gt;
  
  
  There's a tool for that.
&lt;/h2&gt;

&lt;p&gt;A little while ago, I posted about documenting Python code for data science. I had built a package called &lt;a href="https://minchulkim87.github.io/mindoc/" rel="noopener noreferrer"&gt;mindoc&lt;/a&gt; that allowed me to document my pipeline code using markdown within my code.&lt;br&gt;
&lt;/p&gt;
&lt;div class="ltag__link"&gt;
  &lt;a href="/minchulkim87" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F304691%2Fcdd644a5-27d7-4fce-bf3f-7235af3ed8fc.png" alt="minchulkim87"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="/minchulkim87/documenting-python-data-science-code-with-mindoc-25b" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Documenting Python Data Science Code with mindoc&lt;/h2&gt;
      &lt;h3&gt;Min ・ Jan 18 '20&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#python&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#documentation&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#datascience&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#productivity&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;Now I have expanded the tool to help document SQL code and added support for diagrams (using mermaid js)!&lt;/p&gt;

&lt;p&gt;See this GitHub page for how it works: &lt;a href="https://minchulkim87.github.io/mindoc/" rel="noopener noreferrer"&gt;https://minchulkim87.github.io/mindoc/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>sql</category>
      <category>python</category>
      <category>documentation</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Documenting Python Data Science Code with mindoc</title>
      <dc:creator>Min</dc:creator>
      <pubDate>Sat, 18 Jan 2020 12:00:59 +0000</pubDate>
      <link>https://dev.to/minchulkim87/documenting-python-data-science-code-with-mindoc-25b</link>
      <guid>https://dev.to/minchulkim87/documenting-python-data-science-code-with-mindoc-25b</guid>
      <description>&lt;ol&gt;
&lt;li&gt;You can write amazingly readable code in Python.&lt;/li&gt;
&lt;li&gt;Think of the hierarchies of abstraction the same way you would for reports.&lt;/li&gt;
&lt;li&gt;mindoc can help you document your code like markdown.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://minchulkim87.github.io/mindoc/"&gt;https://minchulkim87.github.io/mindoc/&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Write amazingly readable code in Python.
&lt;/h1&gt;

&lt;p&gt;This talk was a game-changer to me:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.youtube.com/watch?v=MpFZUshKypk"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KDKDemtu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/http://img.youtube.com/vi/MpFZUshKypk/0.jpg" alt="untitled12.ipynb" width="480" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I shared it with my colleague, and it changed how we used the already-amazing Jupyter notebook.&lt;/p&gt;

&lt;p&gt;I was already trying to write code that was as readable to other humans as I thought possible, but the talk gave me a better template to work with.&lt;/p&gt;

&lt;p&gt;Basically, the idea is to write code that even a non-dev could understand if they tried.&lt;/p&gt;

&lt;p&gt;In pandas, for example, the equivalent way of writing code would be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clean_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;group_data_by_customer_type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;compute_statistics&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generate_visualisation&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even a non-data-person or a non-coder would recognise that the data is cleaned, then grouped by customer type, and then some statistics are calculated before the visualisation is created.&lt;/p&gt;
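
&lt;p&gt;If the chaining looks like magic, it helps to know that &lt;code&gt;df.pipe(f)&lt;/code&gt; simply calls &lt;code&gt;f(df)&lt;/code&gt;, so a chain of pipes is just nested function calls written in reading order. The sketch below shows the same idea with plain Python lists instead of DataFrames so that it runs without pandas; the &lt;code&gt;pipe&lt;/code&gt; helper, the sample rows, and the bodies of the step functions are made up for illustration.&lt;/p&gt;

```python
from functools import reduce


def pipe(value, *steps):
    """Apply each step to the result of the previous one, left to right."""
    return reduce(lambda result, step: step(result), steps, value)


# Hypothetical stand-ins for the pipeline steps named in the post.
def clean_data(rows):
    """Drop rows that have no amount recorded."""
    return [row for row in rows if row["amount"] is not None]


def group_data_by_customer_type(rows):
    """Collect the amounts for each customer type."""
    groups = {}
    for row in rows:
        groups.setdefault(row["customer_type"], []).append(row["amount"])
    return groups


def compute_statistics(groups):
    """Compute the mean amount per customer type."""
    return {name: sum(amounts) / len(amounts) for name, amounts in groups.items()}


# Made-up sample data.
ROWS = [
    {"customer_type": "retail", "amount": 10.0},
    {"customer_type": "retail", "amount": 20.0},
    {"customer_type": "wholesale", "amount": 100.0},
    {"customer_type": "wholesale", "amount": None},
]

if __name__ == "__main__":
    print(pipe(ROWS, clean_data, group_data_by_customer_type, compute_statistics))
```

&lt;p&gt;Written as nested calls, the same pipeline would read inside-out: &lt;code&gt;compute_statistics(group_data_by_customer_type(clean_data(ROWS)))&lt;/code&gt;. The piped version reads in the order the steps actually happen.&lt;/p&gt;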

&lt;p&gt;With this newfound excitement over just how &lt;strong&gt;clean&lt;/strong&gt; our code started to look, we quickly started refactoring our data pipeline code.&lt;/p&gt;

&lt;h1&gt;
  
  
  And then came the awkward moment
&lt;/h1&gt;

&lt;p&gt;The great thing is that we were able to abstract away the details of steps like "clean_data"! If the reader is curious about what "cleaning" the data means in this code, they simply need to look at the clean_data function, which may look like the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rename_columns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;remove_special_characters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;correct_date_format&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;replace_values&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Again, the steps taken in the clean_data "phase" were obvious. Again, you could abstract away further details. Great!&lt;/p&gt;

&lt;p&gt;Then we hit our first moment of 'um...'. We constantly had decisions to make about exactly what we would abstract away. And to what end? Were we to get down to the lowest level of "atomic" manipulations of the data?&lt;/p&gt;

&lt;p&gt;Once you get to over a hundred such functions, how do you order them? In the order they are used? What if they are used multiple times? Alphabetically? But then the code stops "explaining itself".&lt;/p&gt;

&lt;p&gt;Our project seemed far too large for this way of coding to scale to our needs. Not to mention that even the most readable code is too "technical" for managers and other business people, who are unlikely to be programmers or "data people". We needed a way to write "documentation" for both the other wizards and the muggles.&lt;/p&gt;

&lt;h1&gt;
  
  
  Then came the Aha! moment
&lt;/h1&gt;

&lt;p&gt;I remembered watching another amazing talk (for a JavaScript audience, but the message applies to any developer):&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.youtube.com/watch?v=BzX4aTRPzno"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lZS07hDn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/http://img.youtube.com/vi/BzX4aTRPzno/0.jpg" alt="Write Less, Do More" width="480" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then I suggested that we think of organising the code as if we were writing a Word document. The levels of abstraction were equivalent to levels of headings. If we were to write the code in English, what would the title, headings, and subheadings be?&lt;/p&gt;

&lt;p&gt;We decided it was a good idea to arrange our code this way, and started to use Python comments to indicate headings.&lt;/p&gt;

&lt;p&gt;Conveniently, Python's comment syntax resembles a Markdown header.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;# This is a comment. Or... is it a header?&lt;/code&gt;&lt;/p&gt;
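&lt;p&gt;As a rough illustration of the idea, a script organised this way might look like the sketch below. The data, the function names, and the heading titles are all made up for this example; only the use of comments as headings comes from the approach described here.&lt;/p&gt;

```python
# Customer report
# (hypothetical example: the data and function names are invented)

## Load the data

def load_data() -> list[dict]:
    # Stand-in for reading a real data source.
    return [
        {"customer_type": "retail", "spend": 120.0},
        {"customer_type": "retail", "spend": None},
        {"customer_type": "business", "spend": 900.0},
    ]


## Clean the data

### Remove rows with missing spend

def remove_missing(rows: list[dict]) -> list[dict]:
    return [row for row in rows if row["spend"] is not None]


## Summarise

### Total spend by customer type

def total_by_type(rows: list[dict]) -> dict[str, float]:
    totals: dict[str, float] = {}
    for row in rows:
        key = row["customer_type"]
        totals[key] = totals.get(key, 0.0) + row["spend"]
    return totals


## Run the process

summary = total_by_type(remove_missing(load_data()))
print(summary)
```

&lt;p&gt;Reading only the comment lines gives an outline of the whole process, much like the headings of a Word document.&lt;/p&gt;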

&lt;p&gt;Surely this made sense to a lot of other people. Surely, there was some standard practice around documenting data analysis, data engineering, and data science projects &lt;em&gt;with code&lt;/em&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  We turned to the Python documentation tools
&lt;/h1&gt;

&lt;p&gt;We looked enviously at Sphinx. But while it seemed great for documenting packages, it didn't meet the needs of a data project.&lt;/p&gt;

&lt;p&gt;First of all, "documentation" for Python modules relies heavily on docstrings. These are useful for creating a manual of sorts for the user of the module: what methods are available, what arguments a function accepts, and so on.&lt;/p&gt;

&lt;p&gt;But the point of this new way of coding was that the code was "self-describing". Did I really need to start writing like this?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="s"&gt;"""
    This function cleans the dataframe.

    First the columns are renamed.
    Then the special characters are removed.
    Then the date format is corrected.
    Then some values are replaced with others.

    Args:
        df (pd.DataFrame): Pandas DataFrame to be cleaned

    Returns:
        pd.DataFrame: The cleaned Pandas DataFrame
    """&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rename_columns&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;remove_special_characters&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;correct_date_format&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pipe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;replace_values&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Surely, the names of the functions and the Python type hints already fully describe the function. The "documentation" we needed was of the &lt;em&gt;process&lt;/em&gt; rather than the &lt;em&gt;tool&lt;/em&gt;. As I wrote above, documentation of a tool that others need to use is absolutely required. For the individual functions of a process in a data project? Not so much. The process itself should be documented, though.&lt;/p&gt;

&lt;p&gt;There was Pycco, which we drew some ideas from, but the presentation made it feel like it was &lt;em&gt;annotating&lt;/em&gt; the code rather than &lt;em&gt;documenting&lt;/em&gt; the code. Again, there is a place for annotation, but it didn't quite meet our needs.&lt;/p&gt;

&lt;p&gt;There was an added limitation of being a public servant. We did not have the luxury of using "pip install", especially when it came to upgrading or extending JupyterLab. Even if we discovered a solution, we could not easily use it if it required packages we did not have access to.&lt;/p&gt;

&lt;p&gt;So. I decided to write a simple documentation tool.&lt;/p&gt;

&lt;h1&gt;
  
  
  Introducing mindoc
&lt;/h1&gt;

&lt;p&gt;It is a simple tool that converts a .py file into an HTML document.&lt;/p&gt;

&lt;p&gt;It is basically a flipped version of markdown.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In markdown, you write a document and put code between fenced triplets of backticks.&lt;/li&gt;
&lt;li&gt;With mindoc, you write code and put the documentation between fenced triplets of quotes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The use of the headers allows you to organise your code in hierarchical levels, and "hide" away whichever level of granularity you wish to make the logic of the code more readable.&lt;/p&gt;

&lt;p&gt;It automatically generates the table of contents for you too.&lt;/p&gt;
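&lt;p&gt;To give a feel for how such a conversion could work, here is a minimal sketch that splits a source file into documentation and code segments and collects a table of contents from the headings. This is not mindoc's actual implementation. The function names, and the rule that a documentation block is delimited by lines containing only three double quotes, are assumptions made for this example.&lt;/p&gt;

```python
import re


def split_segments(source: str) -> list[tuple[str, str]]:
    """Split Python source into ("doc", text) and ("code", text) segments.

    Assumption: a documentation block starts and ends with a line that
    contains only three double quotes.
    """
    segments: list[tuple[str, str]] = []
    buffer: list[str] = []
    kind = "code"

    def flush() -> None:
        # Store the lines gathered so far, skipping blank-only segments.
        text = "\n".join(buffer).strip("\n")
        if text.strip():
            segments.append((kind, text))
        buffer.clear()

    for line in source.splitlines():
        if line.strip() == '"""':
            flush()
            kind = "doc" if kind == "code" else "code"
        else:
            buffer.append(line)
    flush()
    return segments


def table_of_contents(segments: list[tuple[str, str]]) -> list[tuple[int, str]]:
    """Collect (level, title) pairs from Markdown-style headings in doc segments."""
    toc = []
    for kind, text in segments:
        if kind != "doc":
            continue
        for line in text.splitlines():
            match = re.match(r"(#+)\s+(.*)", line.strip())
            if match:
                toc.append((len(match.group(1)), match.group(2)))
    return toc


# Hypothetical input file, held in a string for the example.
EXAMPLE = '''
"""
# Sales report
## Clean the data
"""
df = load()
"""
## Visualise
"""
plot(df)
'''

print(table_of_contents(split_segments(EXAMPLE)))
```

&lt;p&gt;A real converter would then render the documentation segments as HTML and the code segments as highlighted code blocks, but the split above is the core of the "flipped Markdown" idea.&lt;/p&gt;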

&lt;p&gt;Go check it out on &lt;a href="https://minchulkim87.github.io/mindoc/"&gt;GitHub&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;The page that you see from the link &lt;em&gt;is&lt;/em&gt; the .py file (converted into HTML).&lt;/p&gt;

</description>
      <category>python</category>
      <category>documentation</category>
      <category>datascience</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Company data using Wikidata</title>
      <dc:creator>Min</dc:creator>
      <pubDate>Sat, 11 Jan 2020 04:56:13 +0000</pubDate>
      <link>https://dev.to/minchulkim87/company-data-using-wikidata-n19</link>
      <guid>https://dev.to/minchulkim87/company-data-using-wikidata-n19</guid>
      <description>&lt;ol&gt;
&lt;li&gt;Company data is useful.&lt;/li&gt;
&lt;li&gt;Company information is hard to find in bulk.&lt;/li&gt;
&lt;li&gt;Wikidata can be a useful starting point.&lt;/li&gt;
&lt;li&gt;Wikidata can be great for collecting other information too.&lt;/li&gt;
&lt;li&gt;Let's contribute to open data, not just open-source code.&lt;/li&gt;
&lt;/ol&gt;

&lt;h1&gt;
  
  
  Company data is really useful
&lt;/h1&gt;

&lt;p&gt;Whether you are doing competitor analysis, econometric analysis, or policy analysis, it would be great to have information about companies. Working in the government, we are often asking how a new public policy might affect small businesses. Necessarily, we need to have data about the list of businesses, and whether they are large or small.&lt;/p&gt;

&lt;h1&gt;
  
  
  Collecting information about companies is hard
&lt;/h1&gt;

&lt;p&gt;You may think that there are government bodies that collect this information. After all, corporate taxes are a big part of how governments source their funds. However, just because one part of the government has a certain set of data, it does not mean that other parts of the government can access that data. Many of these restrictions are necessary for privacy, and I am not here to debate data sharing and governance policies, but we can agree that having access to such data can benefit researchers.&lt;/p&gt;

&lt;h1&gt;
  
  
  Some options for company data
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Buy the data
&lt;/h2&gt;

&lt;p&gt;Precisely because company data is hard to get, there are organisations that sell such information. Organisations such as universities and government agencies purchase this data for lack of alternatives.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get data provided by some governments
&lt;/h2&gt;

&lt;p&gt;Some governments are pretty good at organising and releasing data. Australia, for example, has &lt;a href="//data.gov.au"&gt;data.gov.au&lt;/a&gt;, which hosts information such as a list of businesses and their ABNs (Australian Business Numbers) so that you can identify them. Also on data.gov.au is a dataset released for tax transparency purposes for "large" companies.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A quick side note: definitions of large companies can vary. You could use the annual turnover (revenue) and/or the number of employees as measures of size. It would depend on what question you are trying to answer.&lt;/em&gt;&lt;/p&gt;
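&lt;p&gt;To make the side note concrete, here is a small sketch of a size check in which the thresholds are arguments rather than fixed rules. The function name and the default numbers are assumptions for illustration only, not an official definition of a large company.&lt;/p&gt;

```python
from typing import Optional


def is_large_company(
    employees: Optional[int] = None,
    annual_revenue: Optional[float] = None,
    min_employees: int = 200,
    min_revenue: float = 100_000_000.0,
    require_both: bool = False,
) -> bool:
    """Classify a company as large using employees and/or revenue.

    A missing measure (None) is treated as not meeting its threshold.
    The default thresholds are illustrative placeholders.
    """
    by_employees = employees is not None and employees >= min_employees
    by_revenue = annual_revenue is not None and annual_revenue >= min_revenue
    if require_both:
        return by_employees and by_revenue
    return by_employees or by_revenue


# Large by headcount alone.
print(is_large_company(employees=250))
# Not large when both measures are required and revenue falls short.
print(is_large_company(employees=250, annual_revenue=5_000_000.0, require_both=True))
```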

&lt;h2&gt;
  
  
  Try to get publicly available data
&lt;/h2&gt;

&lt;p&gt;You could try some form of web scraping. Some very nice people at &lt;a href="https://www.peopledatalabs.com/"&gt;www.peopledatalabs.com&lt;/a&gt; have released what they have collected (I suspect from LinkedIn, by scraping or by using the API). This dataset is limited to companies with a LinkedIn presence and has no information about revenue.&lt;/p&gt;

&lt;p&gt;Companies publish annual financial reports, and searching for these online could be one way of gathering information. I guess that is what some of the data vendors do. This, as you can imagine, is a tedious and time-consuming task. Perhaps there should be an open-source community whose mission is to make this data available and accessible. Such an initiative does in fact exist: &lt;a href="https://opencorporates.com/"&gt;OpenCorporates&lt;/a&gt;. But to me, this data was exposed more as a search tool (as opposed to downloadable bulk data), and there seemed to be a lot of inconsistent duplication of data. At least for what I was trying to do, I could not work out how to even select the "closest" match from the search results. I'm sure that there are many use cases for this tool, but it did not suit mine. Also, this data isn't exactly "free".&lt;/p&gt;

&lt;h1&gt;
  
  
  Wikidata
&lt;/h1&gt;

&lt;p&gt;This is where Wikidata comes in. "Wikidata?" you may ask. Here is what the website says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Wikidata is a free and open knowledge base that can be read and edited by both humans and machines.&lt;br&gt;
Wikidata acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There is an enthusiastic talk about it on YouTube as well:&lt;br&gt;
&lt;a href="http://www.youtube.com/watch?v=24DOvuZWaD0"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--F46lKs-z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/http://img.youtube.com/vi/24DOvuZWaD0/0.jpg" alt="Wikidata" width="480" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is really amazing. &lt;em&gt;We should really consider supporting the &lt;a href="https://wikimediafoundation.org/"&gt;Wikimedia Foundation&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So I have been playing around with SPARQL, which feels like SQL but with its own dialect (&lt;a href="https://www.wikidata.org/wiki/Wikidata:SPARQL_tutorial"&gt;tutorial here&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Specifically for finding "large companies", I have written the following queries, which you can run yourself at the &lt;a href="https://query.wikidata.org/"&gt;Wikidata Query Service&lt;/a&gt;. Note that I have tried to gather ISNI and GRID IDs as identifiers that can be used to join with other sources of information if required.&lt;/p&gt;

&lt;p&gt;For the largest companies by (latest available information about) number of employees:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sparql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?isni&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?grid_id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?businessLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?officialname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?shortname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?countryLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?employees&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P31&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P279&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;Q4830453&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;SERVICE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;label&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;bd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;serviceParam&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;language&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P17&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?country&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P213&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?isni&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P2427&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?grid_id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P1448&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?officialname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;FILTER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;LANG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;?officialname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P1813&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?shortname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;FILTER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;LANG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;?shortname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P1128&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?employees&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;FILTER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?employees&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;BY&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;?employees&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
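&lt;p&gt;If you would rather run a query from Python than from the web interface, something like the following sketch could work. It assumes the public SPARQL endpoint at &lt;code&gt;https://query.wikidata.org/sparql&lt;/code&gt; and the standard SPARQL JSON results layout. The function names, the shortened query with its &lt;code&gt;LIMIT&lt;/code&gt;, and the User-Agent string are placeholders invented for this example.&lt;/p&gt;

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# A shortened version of the employee query, limited to a few rows.
QUERY = """
SELECT DISTINCT ?business ?businessLabel ?employees
WHERE {
   ?business wdt:P31/wdt:P279* wd:Q4830453 .
   ?business wdt:P1128 ?employees .
   FILTER( ?employees >= 200 )
   SERVICE wikibase:label { bd:serviceParam wikibase:language "en" } .
}
ORDER BY DESC (?employees)
LIMIT 10
"""


def build_url(query: str) -> str:
    """Encode the query as URL parameters and ask for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return f"{ENDPOINT}?{params}"


def flatten_bindings(payload: dict) -> list[dict]:
    """Turn SPARQL JSON results into a list of plain {variable: value} dicts."""
    rows = []
    for binding in payload["results"]["bindings"]:
        rows.append({name: cell["value"] for name, cell in binding.items()})
    return rows


def run_query(query: str) -> list[dict]:
    """Send the query over the network and return the flattened rows."""
    request = urllib.request.Request(
        build_url(query),
        # Placeholder value: describe your own tool and contact details here.
        headers={"User-Agent": "company-data-example/0.1 (hypothetical)"},
    )
    with urllib.request.urlopen(request) as response:
        return flatten_bindings(json.load(response))
```

&lt;p&gt;Calling &lt;code&gt;run_query(QUERY)&lt;/code&gt; should then return one dictionary per company. All values arrive as strings, so numeric fields such as the number of employees need converting before any calculation.&lt;/p&gt;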



&lt;p&gt;For the largest companies by (latest available information about) revenue (converted to USD):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sparql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?isni&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?grid_id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?businessLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?officialname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?shortname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?countryLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?revenue_usd&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?isni&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?grid_id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?businessLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?officialname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?shortname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?countryLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;?date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?max_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P31&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P279&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;Q4830453&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="k"&gt;SERVICE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;label&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;bd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;serviceParam&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;language&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P213&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?isni&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P2427&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?grid_id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P1448&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?officialname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;FILTER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;LANG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;?officialname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P1813&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?shortname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;FILTER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;LANG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;?shortname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P17&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?country&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;p&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P2139&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?statement&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?statement&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;pq&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P585&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?date&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="k"&gt;GROUP&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;BY&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?isni&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?grid_id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?businessLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?officialname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?shortname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?countryLabel&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?date&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;?revenue_usd_recorded&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?revenue_usd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P31&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P279&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;Q4830453&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;p&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P2139&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?statement&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?statement&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;pq&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P585&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?date&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nv"&gt;?statement&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;psv&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P2139&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
               &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;quantityAmount&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?revenue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;quantityUnit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;Q4917&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="k"&gt;BIND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;Q4917&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?unit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="k"&gt;BIND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?revenue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?revenue_usd_recorded&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="k"&gt;FILTER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?revenue_usd_recorded&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100000000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;UNION&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nv"&gt;?statement&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;psv&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P2139&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
               &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;quantityAmount&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?revenue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;quantityUnit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?unit&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="k"&gt;FILTER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?unit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;Q4917&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nv"&gt;?unit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;p&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P2284&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?unit_statement&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nv"&gt;?unit_statement&lt;/span&gt;&lt;span class="w"&gt;
               &lt;/span&gt;&lt;span class="nn"&gt;psv&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P2284&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;quantityUnit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;Q4917&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;quantityAmount&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?usd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="k"&gt;BIND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?revenue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?usd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?revenue_usd_recorded&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="k"&gt;FILTER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?revenue_usd_recorded&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100000000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
         &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="k"&gt;GROUP&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;BY&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?date&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;FILTER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?date&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?max_date&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;BY&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;?revenue_usd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
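
&lt;p&gt;To make the logic of the revenue query easier to follow, here is a minimal Python sketch of the same idea on in-memory records: convert each recorded revenue to USD, keep values above the 100,000,000 threshold, and report the value at the latest date for each business. The function name, the record shape, and the conversion rates are hypothetical and exist only for illustration. The sketch makes two simplifying assumptions that may not match the query exactly: every record has a date, and the latest date is chosen among the records that pass the threshold.&lt;/p&gt;

```python
from collections import defaultdict

USD = "Q4917"  # Wikidata item for the United States dollar, as used in the query
THRESHOLD_USD = 100_000_000  # same cut-off as the FILTER in the query


def latest_revenue_usd(records, usd_per_unit):
    """Return {business: (date, revenue_usd)} for the latest dated record.

    records: iterable of (business, date, amount, unit) tuples (hypothetical shape).
    usd_per_unit: dict mapping a unit to its price in USD (hypothetical rates).
    """
    best = defaultdict(dict)  # business -> {date: largest revenue in USD}
    for business, date, amount, unit in records:
        if unit == USD:
            revenue_usd = amount
        elif unit in usd_per_unit:
            revenue_usd = amount * usd_per_unit[unit]
        else:
            continue  # no conversion available, so the record is dropped
        if revenue_usd > THRESHOLD_USD:
            previous = best[business].get(date, 0)
            best[business][date] = max(previous, revenue_usd)
    result = {}
    for business, by_date in best.items():
        latest = max(by_date)  # ISO 8601 date strings sort chronologically
        result[business] = (latest, by_date[latest])
    return result
```

&lt;p&gt;Dates are compared as ISO 8601 strings here, so they all need to share the same format for the "latest" comparison to be meaningful.&lt;/p&gt;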



&lt;p&gt;Not all companies have entries for employee numbers or revenue. I have also tried getting publicly traded companies (those listed on a stock exchange), on the assumption that such companies will be large. For these companies, I have also extracted their ISINs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sparql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?isni&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?grid_id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?businessLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?officialname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?shortname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?countryLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?stockexchangeLabel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?isin&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P31&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P279&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;Q4830453&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P17&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?country&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P414&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?stockexchange&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;SERVICE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;label&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;bd&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;serviceParam&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wikibase&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;language&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P213&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?isni&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P2427&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?grid_id&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P1448&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?officialname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;FILTER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;LANG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;?officialname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P1813&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?shortname&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;FILTER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;LANG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;?shortname&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="k"&gt;OPTIONAL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;wdt&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="ss"&gt;P946&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?isin&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;BY&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;?business&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
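
&lt;p&gt;For anyone who wants to run queries like these outside the web interface, below is a minimal sketch in Python using only the standard library. It assumes the public Wikidata Query Service endpoint at https://query.wikidata.org/sparql and the standard SPARQL JSON results format; the function names and the User-Agent string are placeholders chosen for illustration, not part of any Wikidata client.&lt;/p&gt;

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"  # public Wikidata Query Service


def build_request(query, user_agent="example-script/0.1 (hypothetical contact)"):
    """Build a GET request asking the endpoint for SPARQL JSON results."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    # Wikimedia asks clients to identify themselves with a descriptive User-Agent.
    headers = {"Accept": "application/sparql-results+json", "User-Agent": user_agent}
    return urllib.request.Request(url, headers=headers)


def flatten_bindings(payload):
    """Turn a SPARQL JSON result into a list of plain dicts, one per row."""
    variables = payload["head"]["vars"]
    rows = []
    for binding in payload["results"]["bindings"]:
        # Variables from unmatched OPTIONAL clauses are absent, so default to None.
        rows.append({name: binding.get(name, {}).get("value") for name in variables})
    return rows


def run_query(query):
    """Send the query over the network and return flattened rows."""
    with urllib.request.urlopen(build_request(query)) as response:
        return flatten_bindings(json.load(response))
```

&lt;p&gt;Calling run_query with the query text as a string should return one dict per result row, with None for OPTIONAL values (such as isin) that were not matched. Large queries like the ones above may still hit the service's time limit.&lt;/p&gt;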



&lt;p&gt;I hope that someone may find this information useful. But I also hope that others can improve on this, because I would really like to access reliable and maintained data about companies - one with (official) identifiers, revenue data, employee numbers, and ultimate global owners. I must admit, I am not great with SPARQL.&lt;/p&gt;

&lt;p&gt;Also, I can't thank all the people contributing to Wikipedia enough. Again: &lt;em&gt;We should really consider supporting the &lt;a href="https://wikimediafoundation.org/"&gt;Wikimedia Foundation&lt;/a&gt;&lt;/em&gt; - through contributing data or donations.&lt;/p&gt;

</description>
      <category>database</category>
      <category>opensource</category>
      <category>wikidata</category>
      <category>datascience</category>
    </item>
    <item>
      <title>My Data Science Tech Stack 2020</title>
      <dc:creator>Min</dc:creator>
      <pubDate>Tue, 31 Dec 2019 05:32:15 +0000</pubDate>
      <link>https://dev.to/minchulkim87/my-data-science-tech-stack-2020-1poa</link>
      <guid>https://dev.to/minchulkim87/my-data-science-tech-stack-2020-1poa</guid>
      <description>&lt;ol&gt;
&lt;li&gt;There is a lot to learn in data science.&lt;/li&gt;
&lt;li&gt;We can group the technologies by the subfields of data science.&lt;/li&gt;
&lt;li&gt;There are a few key technologies for each subfield to focus on.&lt;/li&gt;
&lt;li&gt;Creating this personal tech stack list was a fun and useful exercise.&lt;/li&gt;
&lt;/ol&gt;

&lt;h1&gt;
  
  
  There are a lot of things to know in Data Science.
&lt;/h1&gt;

&lt;p&gt;If we tried to survey the technologies used by Data Scientists, we might get a picture like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fw9iry6gui9ane19djuv5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fw9iry6gui9ane19djuv5.png" alt="Tech Landscape"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This list is in no way comprehensive (it is already filtered by my personal interests). On top of that, the list will change from year to year. Keeping up with everything is impossible. Thankfully, no Data Scientist needs to know or use all of these tools.&lt;/p&gt;

&lt;p&gt;But since we should strive to be T-shaped people, we should at least learn a good chunk of these technologies, right? But where do we begin? How much should we learn, and what technologies? First, we should discuss briefly the nature of the profession itself.&lt;/p&gt;

&lt;h1&gt;
  
  
  Data science is a broad and loosely defined field.
&lt;/h1&gt;

&lt;p&gt;Data science is still relatively young and continues to evolve. Once it was popularised by the tagline "sexiest job of the 21st century", the profession attracted many people.&lt;/p&gt;

&lt;p&gt;What may have begun as an application of statistics to solve business problems is now a name that encompasses big data engineering, visualisation, machine learning, deep learning and artificial intelligence. The rapid evolution was due in part to the breadth of areas to which data science can be applied, but also to the rapid development of the technologies themselves. The number of skills that a Data Scientist must possess has grown along with the nebulous definition of the job.&lt;/p&gt;

&lt;p&gt;I imagine that a sole Data Scientist in a small organisation would have a different set of tasks from a Data Scientist in a team within a large organisation. I also imagine that the exact job depends very much on the industry and the nature of the organisation. Combined with the rapid decrease in job tenure (or increase in job mobility), this variance in the job description requires the practising Data Scientist to keep up with a large number of skills and technologies.&lt;/p&gt;

&lt;p&gt;If we had to group some of the subfields of data science, they would look something like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data Analysis&lt;/strong&gt;: This part of the job is about understanding the data. It involves data wrangling, exploratory data analysis, and "explanatory" data analysis. In a larger team, dedicated &lt;em&gt;Data Analysts&lt;/em&gt; will perform these tasks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Visualisation&lt;/strong&gt;: This part of the job is about communicating the data, usually to a non-technical audience. In a larger team, dedicated &lt;em&gt;Business Intelligence Analysts&lt;/em&gt; will perform these tasks, although this can be a part of the Data Analyst's duties.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Machine Learning&lt;/strong&gt;: This part of the profession is probably where the "sexy" comes from. It uses regression, classification, and clustering to solve a wide range of problems, including computer vision and natural language processing. Sometimes, the people who develop new and better ways of solving problems through machine learning are called &lt;em&gt;Machine Learning Scientists&lt;/em&gt; and the people who implement the solutions are called &lt;em&gt;Machine Learning Engineers&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Engineering&lt;/strong&gt;: This part of the field has become so important that &lt;em&gt;Data Engineers&lt;/em&gt; are more in demand than Data Scientists. To do data science, we need data and tools. Making these available is what data engineering is about.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud DevOps&lt;/strong&gt;: More and more, both the data and the tools required to do data science are being made available on the cloud. Navigating a large number of cloud products, managing the scalable infrastructure, and managing the access and security are the duties of the &lt;em&gt;Cloud DevOps Engineers&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Development&lt;/strong&gt;: This part might seem out of place, but if we consider the end-to-end data science projects, then the web is most likely the prototyping or deployment solution. In larger teams, there may be a team of &lt;em&gt;Front-End&lt;/em&gt; and &lt;em&gt;Full-Stack Developers&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sure, these groupings are not clear-cut and there are overlaps. But at least they give me some way of organising the field. As I have noted in the brief descriptions, these roles can be carried out by dedicated specialists in the team. But in a small organisation, it could be up to the one-person generalist data scientist to carry out all of these functions.&lt;/p&gt;

&lt;p&gt;Whether or not we need to perform all of these roles, it helps to understand a little about what other people in the team do. Or perhaps you are looking to switch career tracks, say from data analyst to data engineer or from web developer to machine learning engineer, in which case you will benefit from knowing something about everything.&lt;/p&gt;

&lt;p&gt;Coming back to the tech stack, we can (loosely) group the technologies according to these roles.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Ff4upva8boc5dc5giu7fq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Ff4upva8boc5dc5giu7fq.png" alt="Grouped landscape"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;EDIT: Notes on what these technologies are have been added towards the end of the post.&lt;/p&gt;

&lt;h1&gt;
  
  
  We can't be experts in everything.
&lt;/h1&gt;

&lt;p&gt;Scanning through job descriptions and MOOCs, we can probably narrow down the very employable stack to something like the following:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F9rnujgv697bmk2jq3yip.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F9rnujgv697bmk2jq3yip.png" alt="broad tech stack"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Even this is too much to truly master. Even if I have touch points with all of these technologies, I wouldn't need to learn all of them. I would either be working with specialists or only use them to the extent that can be handled by reading through the documentation.&lt;/p&gt;

&lt;p&gt;But from a learning standpoint, I think I can remove some "duplicates" and drop a few others from my personal "core" stack.&lt;/p&gt;

&lt;h1&gt;
  
  
  So, my tech stack is...
&lt;/h1&gt;

&lt;p&gt;After much consideration, my data science "core" stack for the coming years will look something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F7ci7swpmz9ziypcv0dmq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F7ci7swpmz9ziypcv0dmq.png" alt="My tech stack"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Even though this is just a silly exercise, I still struggled to come up with the final, personal tech stack. I am not sure that this is the ideal minimalist stack. And even if it were, I won't be able to use many of these at my current job. I would probably use this "data science core stack" if I were to embark on a personal project or start my own tech company. There are other technologies that I have learned and will continue to keep up with (Tidyverse, Spark, and Airflow), and additional technologies that I will learn a little about this year (Vue perhaps), knowing that I probably won't use them. I also recognise that one must choose the right tool for the job and that this list may look different in a couple of years' time.&lt;/p&gt;

&lt;p&gt;Nevertheless, I think doing this kind of exercise once in a while is helpful for reassessing the pros and cons of each technology and getting a feel for the overall landscape. I have probably read over a hundred blog posts and watched a hundred videos about the latest developments in, and commentaries on, the technologies listed in the first diagram and many more. This in itself was a valuable exercise. It also helped me narrow my focus to what I wanted to learn and why I should prioritise it, because realistically, I won't be able to (nor do I need to) learn all of them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So, what's your tech stack?&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Some thoughts...
&lt;/h2&gt;

&lt;p&gt;I've jotted down some thoughts that went through my head as I was reducing my tech stack. These are all personal opinions, but the final tech stack is a personal one, so I think that's okay.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There is a lot of knowledge that is a co-requisite for some packages like statsmodels, scikit-learn, and Tensorflow. We shouldn't forget that there are whole fields of mathematics such as probability, statistics, linear algebra, vector calculus, econometrics, and machine learning algorithms that form the foundations of data analysis and machine learning. These take up a very big part of the skillset of a data scientist and therefore provides motivation to reduce what would be included in the core tech stack.&lt;/li&gt;
&lt;li&gt;Computation is also a big part of data science, and good programming skills are essential: best practices (think the Zen of Python, PEP8, and PEP484 for example), programming paradigms (imperative vs declarative, procedural vs object-oriented vs functional, etc.), data structures and algorithms. It is not enough to write code that works, it is also important to write code that other humans can read. Getting deep into the best practises and latest inclusions in Python, JavaScript, R, and SQL is itself quite a demanding. Venturing into other languages such as Scala, Go, Rust or C may not be a luxury that all of us can afford. Another reason to minimise the tech stack.&lt;/li&gt;
&lt;li&gt;Contrary to some popular narrative 80% of a data scientist's job is not data wrangling and cleaning. Half of the job would be meeting various people to understand requirements, communicate insights, and educate the benefits; as well as administrative duties, data governance duties, and professional development. Technical parts are perhaps half the story, and professional Data Scientists should also develop their management, communication, and design thinking skills. Yet another reason to reduce the technology fatigue by minimising the tech stack.&lt;/li&gt;
&lt;li&gt;I think that some technologies are similar enough that learning one would provide the transferrable skills required to learn others. For example, knowing Tableau will probably make learning PowerBI much easier. So learn one, and we can pick up the different nuances if the job requires.&lt;/li&gt;
&lt;li&gt;Similarly, where two different technologies essentially do the same job but are just different implementations, we can consider them "duplicates". For example, PyTorch and TensorFlow are both very good deep learning packages, so picking either one would be a good choice.&lt;/li&gt;
&lt;li&gt;Some competing technologies are all worth keeping for different reasons. ggplot2 is the de facto visualisation tool for R. I have read some blog posts written in 2019 that still claim that R has better visualisation capabilities compared to Python. I think this is one of the reasons why we should take some time once in a while to update our knowledge about the data science tech landscape. Altair is arguably a better implementation of the grammar of graphics than ggplot2. But Altair uses Vega-Lite (built on top of D3), which is very suitable for the web. For "print", I think that seaborn is the best. Within notebook environments, I think that Plotly Express is a very good candidate. At this point, I don't think I could choose between Altair, seaborn, and Plotly Express. All three are declarative and really easy to learn and use. I would probably continue to use all three. I would consider Altair and seaborn to be a part of the pandas ecosystem, and Plotly Express to be a part of the Plotly ecosystem (together with Plotly Dash).&lt;/li&gt;
&lt;li&gt;Some technologies are easier to learn than others. For example, React and Angular are powerful front-end frameworks (or library), but may not be the easiest to master. Some say that Vue, another front-end framework, takes the best of both styles and is easier to learn. Given that I am not looking to specialise in web development, I think Vue or even Svelte will suffice.&lt;/li&gt;
&lt;li&gt;In fact, some technologies are so easy that they are almost not worth "learning" and do not need much keeping up to date. For example, HTML, CSS, Excel and Tableau. I think I can put these under the "assumed skills" category.&lt;/li&gt;
&lt;li&gt;One could say the same about SQL, but I think there are enough dialects within SQL, and enough NoSQL "languages" built to resemble SQL, that it is worth continuing to read up on. In my diagram, I am including all these peripheral and related things within "SQL".&lt;/li&gt;
&lt;li&gt;Speaking of including all related things, much like Tidyverse contains a lot of packages within it, I am including a lot of related packages within the pandas logo in the diagram above: NumPy, SciPy, matplotlib, seaborn, Altair, pandas-profiling, and pyjanitor for example. But I have separated out statsmodels because of the magnitude of co-requisite knowledge required to wield this package.&lt;/li&gt;
&lt;li&gt;Some technologies are "closer together". For example, while R may be better for statistics and econometrics, Python's statsmodels has caught up significantly. Since Python is useful for web development as well as machine learning, an argument can be made for using statsmodels over R. This is a hard balance to strike. On the one hand, there are economic gains to be made by minimising the number of languages to strive for mastery in. But R still appears to be more sophisticated. And in my experience, learning Tidyverse (in R) helped me become a better pandas (in Python) user. In this post, I am somewhat trying to be more economical, so if I had to pick, I would drop R and focus on Python. I would still happily use R if required for a specific job.&lt;/li&gt;
&lt;li&gt;Some technologies are "more native" than others. Spark and Airflow are perhaps more mature and widely used compared with the Dask and Prefect duo. But people on YouTube and tech blogs tell me that Dask and Prefect are more "pythonic" than Spark and Airflow. Similar parallels can be drawn in JavaScript land with Svelte versus React. I am not certain whether these comparisons are meaningful or whether the differences are significant. This is another difficult balance to strike. On the one hand, we want to be more employable by keeping up with the technologies in vogue. On the other hand, concepts such as good data engineering and programming best practices are transferable, and we prefer to learn what is easier to implement and learn. Both Spark and Airflow are easy enough to use if they have been made available, but I wouldn't particularly enjoy configuring Spark clusters and Airflow on my laptop.&lt;/li&gt;
&lt;li&gt;The "cloud" technologies are developing too rapidly, and I am not sure about learning a core stack just in case my future job will require me to know how to use them. With the release of products like Google Cloud Run, I am not sure whether Kubernetes would be worth learning for the generalist data scientist.&lt;/li&gt;
&lt;/ul&gt;
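
&lt;p&gt;To make the point about programming paradigms and readable code a little more concrete, here is a minimal sketch using only the Python standard library. It computes the same result twice, once in an imperative style and once in a more declarative style, with PEP 484 type hints on both. The function names and the sample scores are made up purely for illustration, and both versions assume that at least one score reaches the pass mark.&lt;/p&gt;

```python
from statistics import mean


def passing_mean_imperative(scores: list[float], pass_mark: float) -> float:
    """Imperative style: spell out each step and update running totals."""
    total = 0.0
    count = 0
    for score in scores:
        if score >= pass_mark:
            total += score
            count += 1
    # Assumes at least one passing score, otherwise this divides by zero.
    return total / count


def passing_mean_declarative(scores: list[float], pass_mark: float) -> float:
    """Declarative style: describe the result and let the library do the looping."""
    # Assumes at least one passing score, otherwise mean() raises StatisticsError.
    return mean(score for score in scores if score >= pass_mark)


if __name__ == "__main__":
    sample_scores = [40.0, 60.0, 80.0]  # hypothetical data
    print(passing_mean_imperative(sample_scores, 50.0))
    print(passing_mean_declarative(sample_scores, 50.0))
```

&lt;p&gt;Both functions should agree on the same input. Which one reads better is partly a matter of taste, but the declarative version is closer in spirit to how pandas, SQL, and the Tidyverse are usually written.&lt;/p&gt;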

&lt;h2&gt;
  
  
  List of tools in image
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Languages
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fyz70uen9vu7wh0v98i2y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fyz70uen9vu7wh0v98i2y.png" alt="Pro"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python&lt;/li&gt;
&lt;li&gt;SQL&lt;/li&gt;
&lt;li&gt;R&lt;/li&gt;
&lt;li&gt;Scala&lt;/li&gt;
&lt;li&gt;HTML&lt;/li&gt;
&lt;li&gt;CSS&lt;/li&gt;
&lt;li&gt;JavaScript&lt;/li&gt;
&lt;li&gt;TypeScript&lt;/li&gt;
&lt;li&gt;C++&lt;/li&gt;
&lt;li&gt;Java&lt;/li&gt;
&lt;li&gt;Go&lt;/li&gt;
&lt;li&gt;Rust&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Python + SQL + HTML + CSS + JavaScript will probably take you a very long way.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Analysis
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fgtznectxl94623w2t0de.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fgtznectxl94623w2t0de.png" alt="DA"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tidyverse ecosystem&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;dplyr&lt;/li&gt;
&lt;li&gt;tidyr&lt;/li&gt;
&lt;li&gt;readr&lt;/li&gt;
&lt;li&gt;ggplot2&lt;/li&gt;
&lt;li&gt;shiny&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Plus a whole bunch of packages not included in the image. Just use "Tidyverse". R tends to have individual libraries for doing just about any stats. While I cannot list them all, Tidyverse should be central.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pandas ecosystem&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;pandas&lt;/li&gt;
&lt;li&gt;numpy&lt;/li&gt;
&lt;li&gt;scipy&lt;/li&gt;
&lt;li&gt;statsmodels&lt;/li&gt;
&lt;li&gt;matplotlib&lt;/li&gt;
&lt;li&gt;seaborn&lt;/li&gt;
&lt;li&gt;altair*&lt;/li&gt;
&lt;li&gt;plotly**&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pandas ecosystem is the Python equivalent of R's Tidyverse.&lt;/p&gt;

&lt;p&gt;*Altair really belongs in the Vega family (JavaScript-based and built on D3.js), for which Altair provides the Python API.&lt;/p&gt;

&lt;p&gt;**Unlike R (ggplot2), viz tools in Python do not have a king (yet). I included plotly in the pandas ecosystem in the list, but it is an ecosystem of its own and extends beyond just Python. Sorry.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Engineering
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fj9c36s2e5mg1w7sc433e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fj9c36s2e5mg1w7sc433e.png" alt="DE"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spark&lt;/li&gt;
&lt;li&gt;Dask&lt;/li&gt;
&lt;li&gt;Airflow&lt;/li&gt;
&lt;li&gt;Prefect&lt;/li&gt;
&lt;li&gt;Kafka&lt;/li&gt;
&lt;li&gt;PostgreSQL&lt;/li&gt;
&lt;li&gt;MySQL&lt;/li&gt;
&lt;li&gt;MongoDB&lt;/li&gt;
&lt;li&gt;Cassandra&lt;/li&gt;
&lt;li&gt;Elastic&lt;/li&gt;
&lt;li&gt;Presto&lt;/li&gt;
&lt;li&gt;Redis&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Web Development
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fi6uc7uk7s8kntdfh8y64.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fi6uc7uk7s8kntdfh8y64.png" alt="WD"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Svelte&lt;/li&gt;
&lt;li&gt;Vue&lt;/li&gt;
&lt;li&gt;Angular&lt;/li&gt;
&lt;li&gt;React&lt;/li&gt;
&lt;li&gt;D3.js&lt;/li&gt;
&lt;li&gt;GraphQL (Graphene, Apollo etc.)&lt;/li&gt;
&lt;li&gt;ExpressJS&lt;/li&gt;
&lt;li&gt;NodeJS&lt;/li&gt;
&lt;li&gt;Flask&lt;/li&gt;
&lt;li&gt;SQLAlchemy&lt;/li&gt;
&lt;li&gt;Heroku&lt;/li&gt;
&lt;li&gt;Firebase&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cloud DevOps
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fc58vhur1zfyqre6v2yr2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fc58vhur1zfyqre6v2yr2.png" alt="DO"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS&lt;/li&gt;
&lt;li&gt;Azure&lt;/li&gt;
&lt;li&gt;GCP&lt;/li&gt;
&lt;li&gt;Docker&lt;/li&gt;
&lt;li&gt;Kubernetes&lt;/li&gt;
&lt;li&gt;Jenkins&lt;/li&gt;
&lt;li&gt;Ansible&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Machine Learning
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F9ipq87s38smpps45m77k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F9ipq87s38smpps45m77k.png" alt="ML"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;mlr&lt;/li&gt;
&lt;li&gt;Scikit-Learn&lt;/li&gt;
&lt;li&gt;Keras (absorbed into TensorFlow)&lt;/li&gt;
&lt;li&gt;TensorFlow&lt;/li&gt;
&lt;li&gt;PyTorch&lt;/li&gt;
&lt;li&gt;OpenCV&lt;/li&gt;
&lt;li&gt;spaCy&lt;/li&gt;
&lt;li&gt;H2O.ai&lt;/li&gt;
&lt;li&gt;OpenAI&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Business Intelligence
&lt;/h3&gt;

&lt;p&gt;(Or.. all-in-ones?)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Faan8y3wp3t0gxzdxvn98.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Faan8y3wp3t0gxzdxvn98.png" alt="BI"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Excel&lt;/li&gt;
&lt;li&gt;SAS&lt;/li&gt;
&lt;li&gt;PowerBI&lt;/li&gt;
&lt;li&gt;Tableau&lt;/li&gt;
&lt;li&gt;Knime&lt;/li&gt;
&lt;li&gt;Alteryx&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>datascience</category>
      <category>career</category>
      <category>discuss</category>
    </item>
  </channel>
</rss>
