<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Scott Paulin</title>
    <description>The latest articles on DEV Community by Scott Paulin (@scottgpaulin).</description>
    <link>https://dev.to/scottgpaulin</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F947329%2Fec1796ee-66a5-4532-97df-b8aeea3466e4.jpeg</url>
      <title>DEV Community: Scott Paulin</title>
      <link>https://dev.to/scottgpaulin</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/scottgpaulin"/>
    <language>en</language>
    <item>
      <title>Why I Built Parquet Data</title>
      <dc:creator>Scott Paulin</dc:creator>
      <pubDate>Sat, 01 Jul 2023 08:57:23 +0000</pubDate>
      <link>https://dev.to/scottgpaulin/why-i-built-parquet-data-4mh2</link>
      <guid>https://dev.to/scottgpaulin/why-i-built-parquet-data-4mh2</guid>
      <description>&lt;p&gt;Parquet is my favourite format for storing tabular data.&lt;/p&gt;

&lt;p&gt;Parquet compresses well, it has a strong schema, and it's efficient to analyze. These are all things that CSV, the de facto standard for storing tabular data, lacks.&lt;/p&gt;

&lt;p&gt;For all its benefits, Parquet has some clear downsides.&lt;/p&gt;

&lt;p&gt;Parquet is a binary format. To see inside a Parquet file you need to either write some code to parse it, or use a specialized &lt;a href="https://www.parquetdata.com/view"&gt;Parquet Viewer&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Without a Parquet Viewer, the easiest way to look inside a Parquet file is to &lt;a href="https://www.parquetdata.com/convert/parquet/csv"&gt;convert it to CSV&lt;/a&gt; and then open it in a regular text editor or &lt;a href="https://www.parquetdata.com/convert/parquet/excel"&gt;Excel&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;CSV files are interoperable and understood by many kinds of software, but they are harder to analyse than Parquet files. Plus, many text editors struggle to load a CSV file bigger than 20MB. It would be great to keep data in Parquet format while doing analysis and debugging.&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://www.parquetdata.com/"&gt;Parquet Data&lt;/a&gt; because I love the Parquet format, but had trouble working with it. I made the &lt;a href="https://www.parquetdata.com/view"&gt;Parquet Viewer&lt;/a&gt; and &lt;a href="https://www.parquetdata.com/convert"&gt;converters&lt;/a&gt; to make it easier to debug some code that I was working on.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Data Science in your browser - introducing We Do Data Science</title>
      <dc:creator>Scott Paulin</dc:creator>
      <pubDate>Mon, 17 Oct 2022 22:03:34 +0000</pubDate>
      <link>https://dev.to/scottgpaulin/data-science-in-your-browser-introducing-we-do-data-science-36dh</link>
      <guid>https://dev.to/scottgpaulin/data-science-in-your-browser-introducing-we-do-data-science-36dh</guid>
      <description>&lt;p&gt;I love writing code, but it got painful starting a new software project every time I wanted to make some graphs or do analysis.&lt;/p&gt;

&lt;p&gt;I tried some free online tools, but couldn't find anything I really liked.&lt;/p&gt;

&lt;p&gt;Some even wanted me to give them my phone number to start a free account. &lt;/p&gt;

&lt;p&gt;Protip: to make your number believable, just input your actual phone number with 1 or 2 numbers changed (also works great in bars).&lt;/p&gt;

&lt;p&gt;One day I even tried downloading Tableau. 20 minutes and 2 email confirmations later, I was messaging support to cancel my account (they don't let you do this by yourself).&lt;/p&gt;

&lt;p&gt;So I created &lt;a href="https://www.wedodatascience.com/"&gt;We Do Data Science&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--A9hAtogF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wbi31fqj4j9a5uf3ro7w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--A9hAtogF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wbi31fqj4j9a5uf3ro7w.png" alt="Image description" width="880" height="660"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We Do Data Science lets you make &lt;a href="https://www.wedodatascience.com/graphs"&gt;graphs&lt;/a&gt; in your browser. &lt;/p&gt;

&lt;p&gt;My personal favourites are &lt;a href="https://www.wedodatascience.com/graphs/pie/nightingale"&gt;Nightingale&lt;/a&gt; graphs. I never knew about them until I started making this website.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xvM3eUEs--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/15eghm5afvugprg89756.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xvM3eUEs--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/15eghm5afvugprg89756.png" alt="Image description" width="880" height="653"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When I make graphs, sometimes I also want to do data analysis, like performing &lt;a href="https://www.wedodatascience.com/regression"&gt;regression&lt;/a&gt; or calculating &lt;a href="https://www.wedodatascience.com/correlation"&gt;correlations&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So I built regression calculators&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ToYqwcsN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tv849cq04na0oo0fzjjx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ToYqwcsN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tv849cq04na0oo0fzjjx.png" alt="Image description" width="880" height="630"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;and correlation calculators&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--13cOIrx2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/r0w4jzlcu6j96kx070na.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--13cOIrx2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/r0w4jzlcu6j96kx070na.png" alt="Image description" width="880" height="630"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I never knew there were different kinds of correlation until I made this site. Cool, huh?&lt;/p&gt;

&lt;p&gt;When I work with data, sometimes I need to convert file types. There are lots of great conversion tools online, but none of them &lt;a href="https://www.wedodatascience.com/datasets/convert/parquet"&gt;converted Parquet&lt;/a&gt; files.&lt;/p&gt;

&lt;p&gt;So I made a &lt;a href="https://www.wedodatascience.com/datasets/convert/csv/parquet"&gt;CSV to Parquet&lt;/a&gt; converter and a &lt;a href="https://www.wedodatascience.com/datasets/convert/parquet/csv"&gt;Parquet to CSV&lt;/a&gt; converter.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xjhoMljY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0unxg8dg6cl4t9aivuwd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xjhoMljY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0unxg8dg6cl4t9aivuwd.png" alt="Image description" width="880" height="630"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I even got interested in statistics and made some &lt;a href="https://www.wedodatascience.com/statistics"&gt;statistics calculators&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jYLeM1-z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0t9yy9ff8ewzvwn9djgk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jYLeM1-z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0t9yy9ff8ewzvwn9djgk.png" alt="Image description" width="880" height="630"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Anyway, if you do try out &lt;a href="https://www.wedodatascience.com/"&gt;We Do Data Science&lt;/a&gt;, I hope you enjoy using it as much as I do.&lt;/p&gt;

&lt;p&gt;If something isn't working, or you want to get in touch, you can find me on Twitter with the handle scottgpaulin.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>analytics</category>
    </item>
  </channel>
</rss>
