<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Benjamin Scholtz</title>
    <description>The latest articles on DEV Community by Benjamin Scholtz (@benscholtz).</description>
    <link>https://dev.to/benscholtz</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1270544%2F9c038e30-7520-4a49-a28f-e5c4a3df7f88.jpeg</url>
      <title>DEV Community: Benjamin Scholtz</title>
      <link>https://dev.to/benscholtz</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/benscholtz"/>
    <language>en</language>
    <item>
      <title>From Punch Cards to the "Modern Data Stack"</title>
      <dc:creator>Benjamin Scholtz</dc:creator>
      <pubDate>Wed, 31 Jan 2024 21:35:07 +0000</pubDate>
      <link>https://dev.to/benscholtz/from-punch-cards-to-the-modern-data-stack-f0l</link>
      <guid>https://dev.to/benscholtz/from-punch-cards-to-the-modern-data-stack-f0l</guid>
      <description>&lt;p&gt;A journey from the origins of computing and data analytics to what we now call the "Modern Data Stack". What comes next?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Origins of Computing and Data Analytics
&lt;/h2&gt;

&lt;p&gt;Computing and data analytics date back to the mid-1950s and started taking shape in the 1970s with the introduction of SQL:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1954: &lt;strong&gt;Natural Language Processing (NLP)&lt;/strong&gt; - “Georgetown-IBM experiment”, machine translation of Russian to English&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzxmp17zpuufe9gw8f7ww.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzxmp17zpuufe9gw8f7ww.png" alt="“Georgetown-IBM experiment”, machine translation of Russian to English" width="702" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgdtfxcye6eukhztgt5rj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgdtfxcye6eukhztgt5rj.png" alt="“Georgetown-IBM experiment”, machine translation of Russian to English" width="307" height="164"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1960: &lt;strong&gt;Punch Cards&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fame3dyk09cblorujq82x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fame3dyk09cblorujq82x.png" alt="Punch Cards" width="800" height="350"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
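&lt;li&gt;1970: &lt;strong&gt;Structured Query Language (SQL)&lt;/strong&gt; - Grew out of E. F. Codd's 1970 relational model; the language itself first appeared at IBM as SEQUEL in 1974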
&lt;/li&gt;
&lt;li&gt;1970s: &lt;strong&gt;Interactive Financial Planning Systems&lt;/strong&gt; - Create a language to “allow executives to build models without intermediaries”&lt;/li&gt;
&lt;li&gt;1972: &lt;strong&gt;C, LUNAR&lt;/strong&gt; - The C programming language appears; LUNAR, one of the earliest natural language information retrieval systems, helped geologists access, compare, and evaluate chemical-analysis data on moon rock and soil composition&lt;/li&gt;
&lt;li&gt;1975: &lt;strong&gt;Express&lt;/strong&gt; - The first Online Analytical Processing (OLAP) system, intended to analyse business data from different points of view&lt;/li&gt;
&lt;li&gt;1979: &lt;strong&gt;VisiCalc&lt;/strong&gt; - The first spreadsheet computer program&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fto0z4r2g13x3srxy7gnn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fto0z4r2g13x3srxy7gnn.png" alt="VisiCalc" width="560" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1980s: &lt;strong&gt;Group Decision Support Systems&lt;/strong&gt; - “Computerized Collaborative Work System”&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The “Modern Data Stack”
&lt;/h2&gt;

&lt;p&gt;The "Modern Data Stack" is a set of technologies and tools used to collect, store, process, analyse, and visualise data in a well-integrated cloud-based platform. Although QlikView predates the cloud, it is the earliest example of what most would recognise as an analytics dashboard, the kind of interface now offered by modern platforms like Tableau and PowerBI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1994: &lt;strong&gt;QlikView&lt;/strong&gt; - “Dashboard-driven Analytics”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkb7gpcwbm42qbcbfnj2c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkb7gpcwbm42qbcbfnj2c.png" alt="QlikView" width="800" height="348"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;2003: &lt;strong&gt;Tableau&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;2009: &lt;strong&gt;Wolfram Alpha&lt;/strong&gt; - “Computational Search Engine”&lt;/li&gt;
&lt;li&gt;2015: &lt;strong&gt;PowerBI&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;2017: &lt;strong&gt;ThoughtSpot&lt;/strong&gt; - “Search-driven Analytics”&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Paper, Query Languages, Spreadsheets, Dashboards, Search: What Next?
&lt;/h2&gt;

&lt;p&gt;Some of the most innovative analytics applications, at least in terms of user experience, convert human language to some computational output:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Text-to-SQL:&lt;/strong&gt; A tale as old as time: LUNAR was developed in the 70s to help geologists access, compare, and evaluate chemical-analysis data using natural language. Salesforce's WikiSQL was the first extensive dataset built for the text-to-SQL use case, but it contains only simple SQL queries. The Yale Spider dataset introduced a benchmark for more complex queries, and most recently, BIRD added real-world “dirty” data and efficiency scores to create a more realistic benchmark for text-to-SQL applications.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text-to-Computational-Language:&lt;/strong&gt; Wolfram Alpha, ThoughtSpot&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text-to-Code:&lt;/strong&gt; ChatGPT Advanced Data Analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Is "Conversation-Driven Data Analytics" a natural evolution?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Modern analytics interfaces like &lt;strong&gt;search and chat are evolving&lt;/strong&gt;, becoming more intuitive, enabled by NLP and LLMs&lt;/li&gt;
&lt;li&gt;Analytics interfaces have origins in &lt;strong&gt;enabling decision-makers&lt;/strong&gt;, but decision-makers are still largely &lt;strong&gt;reliant on data analysts&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Many decision-maker &lt;strong&gt;queries are ad-hoc&lt;/strong&gt;, best suited to “throwaway analytics”&lt;/li&gt;
&lt;li&gt;Insight generation is a &lt;strong&gt;creative process&lt;/strong&gt; where many insights are gained in conversations about data, possibly with peers&lt;/li&gt;
&lt;li&gt;The data analytics &lt;strong&gt;workflow is disjointed&lt;/strong&gt;, from conceiving an analysis to presenting the results&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Acknowledgements
&lt;/h2&gt;

&lt;p&gt;All images are sourced from &lt;a href="https://commons.wikimedia.org/wiki/Main_Page"&gt;Wikimedia&lt;/a&gt; and licensed under the &lt;a href="https://creativecommons.org/licenses/by/2.0/deed.en"&gt;Creative Commons Attribution 2.0 Generic&lt;/a&gt; license.&lt;/p&gt;

&lt;p&gt;Dates for the section "The Origins of Computing and Data Analytics" thanks to &lt;a href="https://web.paristech.com/hs-fs/file-2487731396.pdf"&gt;https://web.paristech.com/hs-fs/file-2487731396.pdf&lt;/a&gt; and &lt;a href="http://dssresources.com/history/dsshistoryv28.html"&gt;http://dssresources.com/history/dsshistoryv28.html&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>analytics</category>
    </item>
  </channel>
</rss>
