<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jonathan Powell</title>
    <description>The latest articles on DEV Community by Jonathan Powell (@itsjjpowell).</description>
    <link>https://dev.to/itsjjpowell</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F135309%2Fe1e92ab9-5cfe-4f7f-adea-f8273d8ea344.jpg</url>
      <title>DEV Community: Jonathan Powell</title>
      <link>https://dev.to/itsjjpowell</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/itsjjpowell"/>
    <language>en</language>
    <item>
      <title>How OpenAI Scales Postgres to +1M Queries Per Second</title>
      <dc:creator>Jonathan Powell</dc:creator>
      <pubDate>Mon, 30 Jun 2025 21:04:32 +0000</pubDate>
      <link>https://dev.to/itsjjpowell/how-openai-uses-postgres-at-scale-45c9</link>
      <guid>https://dev.to/itsjjpowell/how-openai-uses-postgres-at-scale-45c9</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;OpenAI is one of the companies at the forefront of the AI movement. ChatGPT is one of the most popular LLM applications, and grows everyday. OpenAI has to handle millions of queries per second, and uses Postgres for critical workloads. Their use of Postgres isn’t a surprise, Postgres is one of the most popular relational databases. What is surprising, however, is how they optimize Postgres to support their needs. Most of their optimizations are based on battle tested best practices.&lt;/p&gt;

&lt;p&gt;In a recent talk titled “Scaling Postgres for OpenAI”, OpenAI Technical Staff shared how they scaled Postgres for their use-cases. My takeaway from the talk is that they aren’t using exotic optimizations to get the best performance. They’re monitoring their database, finding bottlenecks, and using best practices to fix issues.&lt;/p&gt;

&lt;h2&gt;
  
  
  Context
&lt;/h2&gt;

&lt;p&gt;Here’s the context. OpenAI has a single primary database and several read replicas. At the start of their performance journey, the primary served reads and writes, while the read replicas served a subset of read queries. Over time, they had several incidents related to handling high traffic, and spent time hardening the database.&lt;/p&gt;

&lt;p&gt;The talk has more in-depth analysis, but I want to highlight how they approach performance. Every optimization they made is in service of reducing load on their primary database as much as possible. &lt;/p&gt;

&lt;h2&gt;
  
  
  Techniques
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Review Your Application for Unneeded Queries
&lt;/h3&gt;

&lt;p&gt;The team profiled their application and found code paths with read queries, that didn't need to be made. Removing queries from these code paths reduced the number of queries that hit the primary database.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Audit ORM Queries and use raw SQL where needed
&lt;/h3&gt;

&lt;p&gt;Object Relation Mapping (ORM) libraries let us interact with the database without writing a single line of SQL. We get to focus on the relationships between our data rather than how to write the correct joins. The tradeoff is that the ORM may generate complex SQL, when simpler queries suffice. The OpenAI team reviewed the generated SQL from their ORM, and replaced over-complicated queries with raw SQL where it made sense.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Move Read Queries to the Read Replicas
&lt;/h3&gt;

&lt;p&gt;To reduce load on their primary database, the OpenAI team moved as many read queries to read replicas as possible. There is a consistency tradeoff when moving read queries to read replicas. That is, replicas may take some time to be updated with the latest changes from the primary, especially if replica lag is high. However, reducing load on the primary by an order of magnitude was an acceptable tradeoff for potential impacts to consistency. &lt;/p&gt;

&lt;h3&gt;
  
  
  4. Use Timeouts For Queries
&lt;/h3&gt;

&lt;p&gt;Long running queries may hold onto connections and prevent other transactions from being processed. Configuring statement and transaction timeouts ensures the database won’t fail to process other queries due to a few long-running transactions. These can be applied at the application level, or in Postgres itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Use Connection Pools to optimize connection usage
&lt;/h3&gt;

&lt;p&gt;The OpenAI team introduced PgBouncer, a connection pooler for Postgres, to their system to optimize connection pool usage. Connection Pooling is a common technique to efficiently manage the number of connections an application opens to a database. Most database libraries provide connection pooling at the application level. Your database library will manage the connections and re-use them as efficiently as possible. PgBouncer takes it one level lower to the database server. App Servers connect to the PgBouncer proxy just like the would a database, and PgBouncer creates or re-uses connections as efficiently as possible. Whether you choose to go with PgBouncer, or use built-in functionality from your DB library, connection pooling is always a good idea.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Database performance doesn't have to be complicated. Even hyper-scale AI companies lean on best practices to get the performance they're looking for. &lt;/p&gt;

&lt;p&gt;To learn more about how OpenAI scaled their postgres instances give the talks a view here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full Talk (50 minutes): &lt;a href="https://www.youtube.com/watch?v=Ni1SGhNu-Q4" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=Ni1SGhNu-Q4&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Lightning Talk Version (25 minutes): &lt;a href="https://www.youtube.com/watch?v=NvY2kvi1Fa0" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=NvY2kvi1Fa0&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Leave a comment about how you've improved database performance in your systems!&lt;/p&gt;

&lt;p&gt;And here are some related resources about PgBouncer and Postgres Config:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PgBouncer: &lt;a href="https://www.pgbouncer.org/usage.html" rel="noopener noreferrer"&gt;https://www.pgbouncer.org/usage.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Postgres Client Configuration: &lt;a href="https://www.postgresql.org/docs/current/runtime-config-client.html" rel="noopener noreferrer"&gt;https://www.postgresql.org/docs/current/runtime-config-client.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>databasess</category>
      <category>openai</category>
      <category>ai</category>
      <category>postgres</category>
    </item>
    <item>
      <title>We Have Code Quality At Home: Open Source Java Code Quality Tools</title>
      <dc:creator>Jonathan Powell</dc:creator>
      <pubDate>Tue, 07 May 2024 03:13:03 +0000</pubDate>
      <link>https://dev.to/itsjjpowell/we-have-code-quality-at-home-open-source-java-code-quality-tools-4i84</link>
      <guid>https://dev.to/itsjjpowell/we-have-code-quality-at-home-open-source-java-code-quality-tools-4i84</guid>
      <description>&lt;h2&gt;
  
  
  Table Of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Code Quality&lt;/li&gt;
&lt;li&gt;The Tools&lt;/li&gt;
&lt;li&gt;Spotless&lt;/li&gt;
&lt;li&gt;SpotBugs&lt;/li&gt;
&lt;li&gt;PMD&lt;/li&gt;
&lt;li&gt;Putting It Together&lt;/li&gt;
&lt;li&gt;Links&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The best tool for the job isn't always the best tool you have available.&lt;/p&gt;

&lt;p&gt;Engineering teams are often have the build vs. buy conversation. Do we want to "buy" an off-the-shelf tool to solve our problem or do we want to build it in-house? The conversation is usually for complex engineering solutions, but this time we're talking about code quality tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  Code Quality
&lt;/h2&gt;

&lt;p&gt;I think code quality is more about standardizing practices across a team than writing "clean code". Everyone has their own conceptions of best practices, so using code quality tools like linters, and static analysis tools standardize practices across teams. Even better, they can be integrated to the pull request, and CI/CD process to automate the practice of "code quality" - letting developers focus on reviewing the business logic of code changes.&lt;/p&gt;

&lt;p&gt;You get a few benefits of integrating automated code quality tools in your projects:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Standardized (and automated) best practices&lt;/li&gt;
&lt;li&gt;Easier onboarding of new team members coming from different tech stacks&lt;/li&gt;
&lt;li&gt;Evangelize good practices across the codebase&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Tools
&lt;/h2&gt;

&lt;p&gt;SonarQube is one of the most common off-the-shelf code quality solutions, but I'm going to highlight 3 open source java plugins for "good enough" code quality in your Java projects: Spotless, SpotBugs, and PMD&lt;/p&gt;

&lt;h3&gt;
  
  
  Spotless
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/diffplug/spotless/blob/main/README.md" rel="noopener noreferrer"&gt;Spotless&lt;/a&gt; is an open-source, multi-language, customizable code formatter for projects.&lt;br&gt;
It comes with a &lt;a href="https://github.com/diffplug/spotless/blob/main/plugin-maven/README.md#quickstart" rel="noopener noreferrer"&gt;Maven Plugin&lt;/a&gt; that can be customized as needed.&lt;/p&gt;

&lt;p&gt;Here's an example of what it would look like. Spotless comes with some built-in formats. In this example, we specify we want to use Palantir's Java formatting rules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;      &lt;span class="nt"&gt;&amp;lt;plugin&amp;gt;&lt;/span&gt;
        &lt;span class="c"&gt;&amp;lt;!-- https://github.com/diffplug/spotless --&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;com.diffplug.spotless&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spotless-maven-plugin&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;${spotless.version}&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;configuration&amp;gt;&lt;/span&gt;
          &lt;span class="nt"&gt;&amp;lt;java&amp;gt;&lt;/span&gt;
            &lt;span class="c"&gt;&amp;lt;!-- Tell Spotless to Use Palantir's Custom Java Format--&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;palantirJavaFormat&amp;gt;&lt;/span&gt;
              &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;2.39.0&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
              &lt;span class="c"&gt;&amp;lt;!-- optional --&amp;gt;&lt;/span&gt;
              &lt;span class="nt"&gt;&amp;lt;style&amp;gt;&lt;/span&gt;PALANTIR&lt;span class="nt"&gt;&amp;lt;/style&amp;gt;&lt;/span&gt;
              &lt;span class="c"&gt;&amp;lt;!-- or AOSP/GOOGLE (optional) --&amp;gt;&lt;/span&gt;
              &lt;span class="nt"&gt;&amp;lt;formatJavadoc&amp;gt;&lt;/span&gt;false&lt;span class="nt"&gt;&amp;lt;/formatJavadoc&amp;gt;&lt;/span&gt;
              &lt;span class="c"&gt;&amp;lt;!-- defaults to false (optional, requires at least Palantir 2.39.0) --&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;/palantirJavaFormat&amp;gt;&lt;/span&gt;
          &lt;span class="nt"&gt;&amp;lt;/java&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/configuration&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/plugin&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the plugin is added to the pom, you can check for formatting issues with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mvn spotless:check
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or format your code with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mvn spotless:apply
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  SpotBugs
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://spotbugs.github.io/" rel="noopener noreferrer"&gt;SpotBugs&lt;/a&gt; is an open source static anlysis tool. "SpotBugs uses static analysis to inspect Java bytecode for occurrences of bug patterns." This means that SpotBugs runs against the compiled source source code, rather than raw Java files. Because it analyses bytecode, it can catch some types of bugs that source code analysis would not catch.&lt;/p&gt;

&lt;p&gt;SpotBugs provides a &lt;a href="https://spotbugs.github.io/spotbugs-maven-plugin/#:~:text=SpotBugs%20uses%20static%20analysis%20to,do%20not%20indicate%20real%20errors." rel="noopener noreferrer"&gt;maven plugin&lt;/a&gt; for easy usage. Here's what it would look like in a pom.xml:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;      &lt;span class="nt"&gt;&amp;lt;plugin&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;com.github.spotbugs&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;spotbugs-maven-plugin&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;4.8.4.0&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;configuration&amp;gt;&lt;/span&gt;
          &lt;span class="nt"&gt;&amp;lt;plugins&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;plugin&amp;gt;&lt;/span&gt;
              &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;com.h3xstream.findsecbugs&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
              &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;findsecbugs-plugin&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
              &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;1.12.0&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;/plugin&amp;gt;&lt;/span&gt;
          &lt;span class="nt"&gt;&amp;lt;/plugins&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/configuration&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;/plugin&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once added, you can run SpotBugs from the command line. I've include &lt;code&gt;mvn clean compile&lt;/code&gt; because you want to ensure you've compiled the latest version of your code.&lt;br&gt;
It also comes with a nice GUI if you want an easy way to know which classes have issues.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Generate a report for SpotBugs errors&lt;/span&gt;
mvn clean compile spotbugs:spotbugs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check for SpotBugs errors&lt;/span&gt;
mvn clean compile spotbugs:check
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mvn clean compile spotbugs:gui
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to suppress a SpotBugs Warning, you can use the @SuppressFBWarnings annotation with the SpotBugs errors as a parameter(s)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@SuppressFBWarnings&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;={&lt;/span&gt;&lt;span class="s"&gt;"NM_METHOD_NAMING_CONVENTION"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"NP_TOSTRING_COULD_RETURN_NULL"&lt;/span&gt;&lt;span class="o"&gt;},&lt;/span&gt;&lt;span class="n"&gt;justification&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"This is why we ignore these FindBugs warnings."&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyClassWithSpotBugsWarnings&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;SpotBugs will not automatically fix any errors it finds. It's up to the developer to decide if it's findings warrant code changes.&lt;/p&gt;

&lt;h3&gt;
  
  
  PMD
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://pmd.github.io/" rel="noopener noreferrer"&gt;PMD&lt;/a&gt; is a source code static analysis tool. It inspects your Java files for any issues, and has a configurable set of rules&lt;br&gt;
to look at.&lt;/p&gt;

&lt;p&gt;PMD provides a &lt;a href="https://maven.apache.org/plugins/maven-pmd-plugin/index.html" rel="noopener noreferrer"&gt;maven plugin&lt;/a&gt; to easily setup rules you care about. Here's what it &lt;br&gt;
would look like in a pom.xml&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;      &lt;span class="nt"&gt;&amp;lt;plugin&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;groupId&amp;gt;&lt;/span&gt;org.apache.maven.plugins&lt;span class="nt"&gt;&amp;lt;/groupId&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;artifactId&amp;gt;&lt;/span&gt;maven-pmd-plugin&lt;span class="nt"&gt;&amp;lt;/artifactId&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;version&amp;gt;&lt;/span&gt;3.22.0&lt;span class="nt"&gt;&amp;lt;/version&amp;gt;&lt;/span&gt;
        &lt;span class="c"&gt;&amp;lt;!-- Link to a custom ruleset file for rules we care about --&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;configuration&amp;gt;&lt;/span&gt;
          &lt;span class="nt"&gt;&amp;lt;rulesets&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;ruleset&amp;gt;&lt;/span&gt;src/main/resources/pmd/custom-ruleset.xml&lt;span class="nt"&gt;&amp;lt;/ruleset&amp;gt;&lt;/span&gt;
          &lt;span class="nt"&gt;&amp;lt;/rulesets&amp;gt;&lt;/span&gt;
          &lt;span class="nt"&gt;&amp;lt;excludes&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;exclude&amp;gt;&lt;/span&gt;**/test/**&lt;span class="nt"&gt;&amp;lt;/exclude&amp;gt;&lt;/span&gt;
          &lt;span class="nt"&gt;&amp;lt;/excludes&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/configuration&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;/plugin&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can find a list of availble rules on the &lt;a href="https://pmd.github.io/pmd/pmd_rules_java.html" rel="noopener noreferrer"&gt;docs site&lt;/a&gt; and include/exclude them as needed.&lt;/p&gt;

&lt;p&gt;Once added to the pom.xml, you can run PMD:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Generate a report of PMD violations&lt;/span&gt;
mvn pmd:pmd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check the code for any PMD violations&lt;/span&gt;
mvn pmd:check
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to suppress a PMD violation you can do it like so:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@SuppressWarnings&lt;/span&gt;&lt;span class="o"&gt;({&lt;/span&gt;&lt;span class="s"&gt;"PMD.UnusedPrivateField"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"PMD.SingularField"&lt;/span&gt;&lt;span class="o"&gt;})&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyClassWithPMDViolations&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Your code here&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Like SpotBugs, PMD will not automatically fix issues. It's up to the developer to decide if violations warrant changes, or ignore them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Putting it together
&lt;/h3&gt;

&lt;p&gt;Together these tools give you linting and static analysis that you can customize to your needs.&lt;br&gt;
If you want to see a sample of them in action, you can find a sample maven project I made, complete with fully configured pom.xml file here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/jpowell96/j-codequality-starter" rel="noopener noreferrer"&gt;https://github.com/jpowell96/j-codequality-starter&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Links
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/jpowell96/j-codequality-starter" rel="noopener noreferrer"&gt;Sample Project&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/diffplug/spotless" rel="noopener noreferrer"&gt;Spotless Github&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://spotbugs.readthedocs.io/en/latest/introduction.html" rel="noopener noreferrer"&gt;SpotBugs Main Page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://spotbugs.github.io/spotbugs-maven-plugin/" rel="noopener noreferrer"&gt;SpotBugs Maven Plugin Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pmd.github.io/" rel="noopener noreferrer"&gt;PMD Main Page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pmd.github.io/pmd/pmd_rules_java.html" rel="noopener noreferrer"&gt;PMD Java Rules&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://maven.apache.org/plugins/maven-pmd-plugin/index.html" rel="noopener noreferrer"&gt;PMD Maven Plugin Docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>java</category>
      <category>linter</category>
      <category>cleancode</category>
    </item>
    <item>
      <title>Passing The DataCamp SQL Associate Certificate</title>
      <dc:creator>Jonathan Powell</dc:creator>
      <pubDate>Wed, 10 Apr 2024 20:11:05 +0000</pubDate>
      <link>https://dev.to/itsjjpowell/passing-the-datacamp-sql-associate-certificate-jcc</link>
      <guid>https://dev.to/itsjjpowell/passing-the-datacamp-sql-associate-certificate-jcc</guid>
      <description>&lt;p&gt;I just passed the DataCamp SQL Associate Exam. Here's a short list of concepts you should be comfortable with for the exam, and some notes I have about the exam itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Timed Quiz Portion
&lt;/h2&gt;

&lt;p&gt;The timed quiz portion of the exam tests your knowledge on SQL syntax, queries, data manipulation, and general knowledge database management concepts like normalization, data modeling.&lt;/p&gt;

&lt;p&gt;Here's a shortlist the SQL concept(s) covered. If you're familiar with the terms, or have used them at least once, you'll be able to answer most questions.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Joins: LEFT JOIN, FULL JOIN, INNER JOIN, UNION, UNION ALL&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;String Manipulation: UPPER, LOWER, LIKE, ILIKE, ~, REGEX, TRIM, LEFT, RIGHT&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Date Manipulation: DATE_PART, DATE_TRUNC, TO_CHAR&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Number Manipulation: ROUND, TRUNC&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AGGREGATE FUNCTIONS: MAX, MIN, AVG, percentiles&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The more general database questions may require taking the DataCamp courses because they seem to expect you to use the same language that instructors use in those courses.&lt;/p&gt;

&lt;h2&gt;
  
  
  Timed Practical Exam Portion
&lt;/h2&gt;

&lt;p&gt;The timed practical exam is a mutli-part question where you write queries for some provided tables.&lt;/p&gt;

&lt;p&gt;Most of the queries require simple joins, where clauses, and occasionally manipulating data such as missing values.&lt;/p&gt;

&lt;p&gt;DataCamp provides a practical exam that's a good reference point for if you're ready. If you can complete the practice exam, you are more than prepared to complete the practical exam. &lt;/p&gt;

&lt;p&gt;When you take the practical you're given 4 hours to complete it. If you've done the practice exam, you can complete much faster than in 4 hours.&lt;/p&gt;

&lt;p&gt;You'll need to be able to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Join tables together&lt;/li&gt;
&lt;li&gt;Use subqueries and where clauses to filter out data&lt;/li&gt;
&lt;li&gt;Manipulate data to handle missing values, or set default values for columns&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Preparing for the Certification Exam
&lt;/h2&gt;

&lt;p&gt;Based on the exam(s) and the courses I've taken, these DataCamp courses reflect the content that I saw on the exam:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL for Business Analysts&lt;/li&gt;
&lt;li&gt;Functions for Manipulating Data in PostgresSQL&lt;/li&gt;
&lt;li&gt;Manipulating Data in SQL&lt;/li&gt;
&lt;li&gt;Database Design&lt;/li&gt;
&lt;li&gt;Joining Data in SQL&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want more SQL practice outside of DataCamp courses, I've used DataLemur, and StrataScratch to practice SQL data analysis questions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Taking the Exam
&lt;/h2&gt;

&lt;p&gt;Overall, I thought the exam felt like a survey of beginner to intermediate SQL knowledge. The timed exam has a handful of tricky questions, but overall a general understanding of SQL should allow you to a pass. The practical exam is also beginner to intermediate level SQL. If you know how to join tables, use filter clauses like WHERE and IN, manipulate text and numbers, and set default values, you can pass the exam. &lt;/p&gt;

</description>
      <category>sql</category>
      <category>datacamp</category>
      <category>certification</category>
    </item>
    <item>
      <title>Time Series Analysis on FDIC Bank Failures</title>
      <dc:creator>Jonathan Powell</dc:creator>
      <pubDate>Wed, 10 Apr 2024 13:21:44 +0000</pubDate>
      <link>https://dev.to/itsjjpowell/time-series-analysis-on-fdic-bank-failures-1408</link>
      <guid>https://dev.to/itsjjpowell/time-series-analysis-on-fdic-bank-failures-1408</guid>
      <description>&lt;h2&gt;
  
  
  Table Of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Intro&lt;/li&gt;
&lt;li&gt;Readout&lt;/li&gt;
&lt;li&gt;Importing The Data&lt;/li&gt;
&lt;li&gt;Querying The Data&lt;/li&gt;
&lt;li&gt;Analyzing the Data&lt;/li&gt;
&lt;li&gt;Wrapping It Up&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  FDIC Bank Failures - Time Series Analysis
&lt;/h2&gt;

&lt;p&gt;My latest endeavor with SQL is working with timeseries data.&lt;br&gt;
For this mini-analysis, I decided to chart out US bank failures overtime.&lt;/p&gt;

&lt;p&gt;I found an &lt;a href="https://catalog.data.gov/dataset/fdic-failed-bank-list" rel="noopener noreferrer"&gt;official dataset of FDIC Bank Failures&lt;/a&gt; on data.gov, which has public&lt;br&gt;
datasets from the US governemnt. It lists all the banks that have failed since 2000.&lt;/p&gt;
&lt;h2&gt;
  
  
  Readout
&lt;/h2&gt;

&lt;p&gt;Over the time period from 2000 to 2023, there were 568 banks that closed. Georiga, Florida, and Illinois had the most closers over the time period with 93, 76, and 69 closures respectively. On a year basis, Florida had the most closures in a single year with 29 banks closing in 2010.&lt;/p&gt;

&lt;p&gt;As expected, bank closures trend upwards between 2008 - 2012, peaking with 157 closures in 2010. The year over year changes highlight this explosion of failures.&lt;/p&gt;

&lt;p&gt;From 2007 to 2008, the number of bank closures grew over 8x from 3 closures to 25 closures, and grew even more the following year - from 25 closures in 2008 to 140 in 2009 - a 460% increase. Closures in a single year peaked in 2010 with 157 closures with Florida having the most closures with 29 bank clousres. July 2009 was the worst month for closures, with 29 closings in a single month.&lt;/p&gt;
&lt;h2&gt;
  
  
  Importing the Data
&lt;/h2&gt;

&lt;p&gt;To import the data into my Postgres Database I did the following steps&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a table to represent the rows in the CSV file
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;bank_failures&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;bigserial&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;bank_name&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="k"&gt;state&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;cert&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;acquiring_institution&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;closing_date&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;fund&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol&gt;
&lt;li&gt;Copy the data into the table from the csv file, using the &lt;a href="https://www.postgresql.org/docs/current/sql-copy.html" rel="noopener noreferrer"&gt;COPY command&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;COPY&lt;/span&gt; &lt;span class="n"&gt;bank_failures&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="n"&gt;bank_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;cert&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;acquiring_institution&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;fund&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="s1"&gt;'/path/to/banklist.csv'&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FORMAT&lt;/span&gt; &lt;span class="n"&gt;csv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;DELIMITER&lt;/span&gt; &lt;span class="s1"&gt;','&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;QUOTE&lt;/span&gt; &lt;span class="s1"&gt;'"'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;ESCAPE&lt;/span&gt; &lt;span class="s1"&gt;'"'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;ENCODING&lt;/span&gt; &lt;span class="s1"&gt;'UTF-8'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;I originally set the ENCODING to ASCII, but after running into some issues realized I needed to set the encoding to 'UTF-8'. &lt;br&gt;
Now, with the correct character encoding, &lt;br&gt;
I imported my data for real but got this error message: &lt;/p&gt;

&lt;p&gt;&lt;code&gt;ERROR:  invalid byte sequence for encoding "UTF8": 0x96&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;After some digging online, I realized this meant there was a non UTF-8 character in the csv file. There was a hyphen character '-' for one of the banks was a special character. I replaced it with a "regular" hyphen - character and that allowed me to import all the data.&lt;/p&gt;

&lt;p&gt;Once imported, the data looked like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;id&lt;/th&gt;
&lt;th&gt;bank_name&lt;/th&gt;
&lt;th&gt;city&lt;/th&gt;
&lt;th&gt;state&lt;/th&gt;
&lt;th&gt;cert&lt;/th&gt;
&lt;th&gt;acquiring_institution&lt;/th&gt;
&lt;th&gt;closing_date&lt;/th&gt;
&lt;th&gt;fund&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;40&lt;/td&gt;
&lt;td&gt;Citizens Bank&lt;/td&gt;
&lt;td&gt;Sac City&lt;/td&gt;
&lt;td&gt;IA&lt;/td&gt;
&lt;td&gt;8758&lt;/td&gt;
&lt;td&gt;Iowa Trust &amp;amp; Savings Bank&lt;/td&gt;
&lt;td&gt;2023-11-03&lt;/td&gt;
&lt;td&gt;10545&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;41&lt;/td&gt;
&lt;td&gt;Heartland Tri-State Bank&lt;/td&gt;
&lt;td&gt;Elkhart&lt;/td&gt;
&lt;td&gt;KS&lt;/td&gt;
&lt;td&gt;25851&lt;/td&gt;
&lt;td&gt;Dream First Bank, N.A.&lt;/td&gt;
&lt;td&gt;2023-07-28&lt;/td&gt;
&lt;td&gt;10544&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;42&lt;/td&gt;
&lt;td&gt;First Republic Bank&lt;/td&gt;
&lt;td&gt;San Francisco&lt;/td&gt;
&lt;td&gt;CA&lt;/td&gt;
&lt;td&gt;59017&lt;/td&gt;
&lt;td&gt;JPMorgan Chase Bank, N.A.&lt;/td&gt;
&lt;td&gt;2023-05-01&lt;/td&gt;
&lt;td&gt;10543&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;43&lt;/td&gt;
&lt;td&gt;Signature Bank&lt;/td&gt;
&lt;td&gt;New York&lt;/td&gt;
&lt;td&gt;NY&lt;/td&gt;
&lt;td&gt;57053&lt;/td&gt;
&lt;td&gt;Flagstar Bank, N.A.&lt;/td&gt;
&lt;td&gt;2023-03-12&lt;/td&gt;
&lt;td&gt;10540&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;44&lt;/td&gt;
&lt;td&gt;Silicon Valley Bank&lt;/td&gt;
&lt;td&gt;Santa Clara&lt;/td&gt;
&lt;td&gt;CA&lt;/td&gt;
&lt;td&gt;24735&lt;/td&gt;
&lt;td&gt;First-Citizens Bank &amp;amp; Trust Company&lt;/td&gt;
&lt;td&gt;2023-03-10&lt;/td&gt;
&lt;td&gt;10539&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h2&gt;
  
  
  Querying the data
&lt;/h2&gt;

&lt;p&gt;My goal was to run a few simple queries to get the number of bank failures by year,&lt;br&gt;
and then another query to get all the bank failures by month.&lt;/p&gt;

&lt;p&gt;I started with a query to find all the months/years that banks had failures&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Return bank failures by month, year based on the dataset&lt;/span&gt;
 &lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'MONTH'&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;closing_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_failures&lt;/span&gt;  
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;bank_failures&lt;/span&gt; 
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;closing_month&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;closing_month&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To my surprise, there were some months missing from this query. There wasn't a row for months from January - October, 2000. &lt;/p&gt;

&lt;p&gt;It may seem odd, but this is the expected output. If there weren't any failures for a given month in a year, there wouldn't be rows. If I want there to be rows for every month from 2000 - 2023, I need to have date for those months too.&lt;/p&gt;

&lt;p&gt;To accomplish this I made a table called timescale, that contains all the months, years from January, 2000 - December 2023. My initial attempt was overcomplicated, but I realized PostgreSQL provides a helpful function, &lt;a href="https://www.postgresql.org/docs/current/functions-srf.html" rel="noopener noreferrer"&gt;generate_series()&lt;/a&gt; That lets me generate a series of rows between to dates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Generate a row for every month between Jan, 2000 and Dec, 2023&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;generate_series&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2000-01-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-12-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'1 month'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;interval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Generate a row for every year between Jan, 2000 and Dec, 2023&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;generate_series&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2000-01-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-12-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'1 year'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;interval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code will return a result set containing a row for every month from January 2000 until December 2023. &lt;/p&gt;

&lt;p&gt;With this table in mind, we can redo our original "Failures per Month" query and JOIN the timescale onto our bank_failures table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Return bank failures by month, year based on the dataset&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;series&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
&lt;span class="n"&gt;generate_series&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2000-01-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-12-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'1 month'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;interval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'MONTH'&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;closing_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bank_failures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_failures&lt;/span&gt;  
&lt;span class="c1"&gt;-- the series table has complete dates, so join bank_failures onto the series for complete data for the complete time period&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;series&lt;/span&gt; &lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;bank_failures&lt;/span&gt; 
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'MONTH'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'MONTH'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;closing_month&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;closing_month&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Important Notes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The bank_failures table is left joined onto the timescale. This is done because the timescale
includes every month, year while the bank_failures table only includes records for month/years where a bank failed.&lt;/li&gt;
&lt;li&gt;We do a &lt;code&gt;COUNT(bank_failures.closing_date)&lt;/code&gt; rather than a  &lt;code&gt;COUNT(*)&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;  A left join will return a record for each row on the timescale table. COUNT(*) would 
return 1 failure, because it counts rows including NULLs. &lt;code&gt;COUNT(bank_failures.closing_date)&lt;/code&gt; ensures that we only count cases where there were bank failure in a given month.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Likewise, if we wanted to upsample the data to yearly buckets, the query would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;series&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
&lt;span class="n"&gt;generate_series&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2000-01-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-12-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'1 year'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;interval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bank_failures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_failures&lt;/span&gt;  
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;series&lt;/span&gt; &lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;bank_failures&lt;/span&gt; 
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We've updated the generate_series() function to use an interval of 1 year rather than 1 month, and&lt;br&gt;
updated our JOIN condition to be based on matching years.&lt;/p&gt;
&lt;h2&gt;
  
  
  Analyzing the Data
&lt;/h2&gt;

&lt;p&gt;I saved the queries as materialized views and ran a few queries on the data. I used a materialized view because I know that the data is for a fixed time period and will not need to be refreshed. A temp table, or even writing to an actual table would work as well.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;create&lt;/span&gt; &lt;span class="n"&gt;materialized&lt;/span&gt; &lt;span class="k"&gt;view&lt;/span&gt; &lt;span class="n"&gt;bank_failures_by_month&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;series&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
&lt;span class="n"&gt;generate_series&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2000-01-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-12-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'1 month'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;interval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'MONTH'&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;closing_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bank_failures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_failures&lt;/span&gt;  
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;series&lt;/span&gt; &lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;bank_failures&lt;/span&gt; 
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'MONTH'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'MONTH'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;closing_month&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;closing_month&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here are some of the queries I ran on the dataset, and materialized view:&lt;/p&gt;

&lt;p&gt;How many bank closed over the entire time period:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bank_failures&lt;/span&gt; &lt;span class="n"&gt;bf&lt;/span&gt; &lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Which states had the most bank failures over the complete time period?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
    &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;bank_failures&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bank_failures&lt;/span&gt; &lt;span class="n"&gt;bf&lt;/span&gt; 
&lt;span class="k"&gt;group&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;
&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;
&lt;span class="k"&gt;limit&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;state&lt;/th&gt;
&lt;th&gt;bank_failures&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GA&lt;/td&gt;
&lt;td&gt;93&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FL&lt;/td&gt;
&lt;td&gt;76&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IL&lt;/td&gt;
&lt;td&gt;69&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CA&lt;/td&gt;
&lt;td&gt;43&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MN&lt;/td&gt;
&lt;td&gt;23&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WA&lt;/td&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MO&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AZ&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MI&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TX&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Which states had the most closures over the time period?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
    &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="k"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;state_failures_by_year&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bank_failures&lt;/span&gt; &lt;span class="n"&gt;bf&lt;/span&gt; 
&lt;span class="k"&gt;group&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;
&lt;span class="k"&gt;limit&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;state&lt;/th&gt;
&lt;th&gt;closing_year&lt;/th&gt;
&lt;th&gt;state_failures_by_year&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;FL&lt;/td&gt;
&lt;td&gt;2010&lt;/td&gt;
&lt;td&gt;29&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GA&lt;/td&gt;
&lt;td&gt;2009&lt;/td&gt;
&lt;td&gt;25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GA&lt;/td&gt;
&lt;td&gt;2011&lt;/td&gt;
&lt;td&gt;23&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IL&lt;/td&gt;
&lt;td&gt;2009&lt;/td&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GA&lt;/td&gt;
&lt;td&gt;2010&lt;/td&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CA&lt;/td&gt;
&lt;td&gt;2009&lt;/td&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IL&lt;/td&gt;
&lt;td&gt;2010&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FL&lt;/td&gt;
&lt;td&gt;2009&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FL&lt;/td&gt;
&lt;td&gt;2011&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CA&lt;/td&gt;
&lt;td&gt;2010&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For each year, which state(s) had the most bank failures? Include ties.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;ranked_state_failures_by_year&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;select&lt;/span&gt; 
     &lt;span class="k"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;state_failures_by_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="c1"&gt;-- Window functions are evaluated AFTER the group by clause, so this is comparing the bank failures per state&lt;/span&gt;
     &lt;span class="n"&gt;dense_rank&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;over&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;partition&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt;  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bank_failures&lt;/span&gt; &lt;span class="n"&gt;bf&lt;/span&gt; 
    &lt;span class="k"&gt;group&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;
    &lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;all_years&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;generate_series&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'01-01-2000'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'01-01-2023'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'1 year'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;interval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; 
    &lt;span class="k"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;all_years&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'N/A'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state_failures_by_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;state_failures_by_year&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; 
&lt;span class="n"&gt;all_years&lt;/span&gt; &lt;span class="k"&gt;left&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt;
&lt;span class="n"&gt;ranked_state_failures_by_year&lt;/span&gt;
&lt;span class="k"&gt;on&lt;/span&gt;  &lt;span class="k"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;all_years&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ranked_state_failures_by_year&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;closing_year&lt;/span&gt; 
&lt;span class="c1"&gt;-- Some years do not have bank failures, so we include a null check to ensure they appear in the result set&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;dense_rank&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="n"&gt;dense_rank&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;
&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;all_years&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;closing_year&lt;/th&gt;
&lt;th&gt;state&lt;/th&gt;
&lt;th&gt;state_failures_by_year&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;HI&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;IL&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2001&lt;/td&gt;
&lt;td&gt;AR&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2001&lt;/td&gt;
&lt;td&gt;OH&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2001&lt;/td&gt;
&lt;td&gt;IL&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2001&lt;/td&gt;
&lt;td&gt;NH&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2002&lt;/td&gt;
&lt;td&gt;FL&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2003&lt;/td&gt;
&lt;td&gt;WI&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Which years had the most bank failures?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
    &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;over&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_failures&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_failures&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_failures&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bank_failures_by_month&lt;/span&gt; 
&lt;span class="k"&gt;group&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;
&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_failures&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;
&lt;span class="k"&gt;limit&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;rank&lt;/th&gt;
&lt;th&gt;closing_year&lt;/th&gt;
&lt;th&gt;total_failures&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2010&lt;/td&gt;
&lt;td&gt;157&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2009&lt;/td&gt;
&lt;td&gt;140&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;2011&lt;/td&gt;
&lt;td&gt;92&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;2012&lt;/td&gt;
&lt;td&gt;51&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;2008&lt;/td&gt;
&lt;td&gt;25&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Which month/year had the most bank failures?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
    &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;over&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;total_failures&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;closing_year&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;closing_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;total_failures&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bank_failures_by_month&lt;/span&gt; 
&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;total_failures&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;
&lt;span class="k"&gt;limit&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;rank&lt;/th&gt;
&lt;th&gt;closing_year&lt;/th&gt;
&lt;th&gt;closing_month&lt;/th&gt;
&lt;th&gt;total_failures&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2009&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;24&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2010&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;23&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;2010&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;22&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;2009&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;2010&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;What was percent change each year?*&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;bank_failures_by_year&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="c1"&gt;-- Return a table with the year, total_failures, and toal failures in the previous year, defaulting to 0 if not data is found&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bank_failures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_failures&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;coalesce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bank_failures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="n"&gt;over&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;prev_year_failures&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
&lt;span class="n"&gt;generate_series&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2000-01-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2023-12-01'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s1"&gt;'1 year'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;interval&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;series&lt;/span&gt; &lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;bank_failures&lt;/span&gt; 
&lt;span class="c1"&gt;-- Join the series of years with the bank failures table based on matching years&lt;/span&gt;
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;timescale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'YEAR'&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;closing_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;-- Calculate the year over year change between each year of bank failures&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_failures&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="k"&gt;case&lt;/span&gt; 
        &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="n"&gt;closing_year&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="s1"&gt;'N/A'&lt;/span&gt;
        &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="n"&gt;prev_year_failures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;total_failures&lt;/span&gt;  &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="s1"&gt;'0%'&lt;/span&gt;
        &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="n"&gt;prev_year_failures&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="s1"&gt;'N/A'&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(((&lt;/span&gt;&lt;span class="n"&gt;total_failures&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;prev_year_failures&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="nb"&gt;NUMERIC&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;prev_year_failures&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;'%'&lt;/span&gt; 
    &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;yoy_pct_change&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bank_failures_by_year&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;closing_year&lt;/th&gt;
&lt;th&gt;total_failures&lt;/th&gt;
&lt;th&gt;yoy_pct_change&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2006.0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2007.0&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2008.0&lt;/td&gt;
&lt;td&gt;25&lt;/td&gt;
&lt;td&gt;733.33%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2009.0&lt;/td&gt;
&lt;td&gt;140&lt;/td&gt;
&lt;td&gt;460.00%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2010.0&lt;/td&gt;
&lt;td&gt;157&lt;/td&gt;
&lt;td&gt;12.14%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2011.0&lt;/td&gt;
&lt;td&gt;92&lt;/td&gt;
&lt;td&gt;-41.40%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2012.0&lt;/td&gt;
&lt;td&gt;51&lt;/td&gt;
&lt;td&gt;-44.57%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2013.0&lt;/td&gt;
&lt;td&gt;24&lt;/td&gt;
&lt;td&gt;-52.94%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Wrapping it up
&lt;/h2&gt;

&lt;p&gt;For time series data that has gaps, the generate_series() function can enable &lt;br&gt;
analyzing the data with a complete timeline. And, if needed, can support different&lt;br&gt;
levels of granularity (day, week, month, etc.) to match the granularity of your date.&lt;/p&gt;

&lt;h2&gt;
  
  
  Improvements
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;I made use of the EXTRACT function often in my queries. It might make sense to just have columns for month, year since I was using extract so often. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Each row has an acquiring institution. It would be interesting to revisit&lt;br&gt;
the dataset and use some recursive queries to uncover chains of acquisitions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Normalizing the some columns such as state, acquiring institution, institution could make it easier to run queries related to who acquired which banks. FDIC has additional datasets that assign a unique id to institutions. An enhancement to this project would be to import all FDIC institutions and use that as a reference to normalize data for institutions in the bank failures dataset.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Helpful Links
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.fdic.gov/resources/resolutions/bank-failures/" rel="noopener noreferrer"&gt;FDIC Bank Failures Homepage&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.postgresql.org/docs/current/functions-srf.html" rel="noopener noreferrer"&gt;generate_series() postgres docs&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.timescale.com/blog/how-to-create-lots-of-sample-time-series-data-with-postgresql-generate_series/" rel="noopener noreferrer"&gt;timescale db - generate series explanation&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>postgres</category>
      <category>analysis</category>
      <category>timeseries</category>
    </item>
    <item>
      <title>Connecting with your Database with the Information Schema</title>
      <dc:creator>Jonathan Powell</dc:creator>
      <pubDate>Mon, 19 Feb 2024 15:17:30 +0000</pubDate>
      <link>https://dev.to/itsjjpowell/learn-more-about-your-database-with-postgres-information-schema-2pid</link>
      <guid>https://dev.to/itsjjpowell/learn-more-about-your-database-with-postgres-information-schema-2pid</guid>
      <description>&lt;h2&gt;
  
  
  Table Of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Intro&lt;/li&gt;
&lt;li&gt;The Information Schema&lt;/li&gt;
&lt;li&gt;System Views&lt;/li&gt;
&lt;li&gt;System Catalog&lt;/li&gt;
&lt;li&gt;Putting It All Together&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Intro
&lt;/h2&gt;

&lt;p&gt;If you've worked with SQL before, you're familiar with writing queries to ask questions about the &lt;em&gt;data&lt;/em&gt; in your database. Give me all the people that user A follows. Tell me how many sales we made in the last week. &lt;/p&gt;

&lt;p&gt;Databases are good for asking questions about your data, but they also store &lt;em&gt;metadata&lt;/em&gt; about the data in your database. PostgreSQL in particular stores data about Table names, schemas, indexes, views and much more. And it's just a few sql queries away from you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Querying Metadata in PostgreSQL
&lt;/h2&gt;

&lt;p&gt;PostgreSQL stores metadata for the database in 3 areas: The Information Schema, System Views, and System Catalogs.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://www.postgresql.org/docs/16/information-schema.html" rel="noopener noreferrer"&gt;The Information Schema&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Postgres' Information Schema "consists of a set of views that contain information about the objects defined in the current database." A database schema is essentially a way to put tables into their own group. By default, any tables you create in Postgres are part of a database schema called "public". &lt;/p&gt;

&lt;p&gt;Though you'll typically query your tables like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;my_table&lt;/span&gt; &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also include the schema as part of your query like so:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;public&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;my_table&lt;/span&gt; &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;em&gt;information schema&lt;/em&gt;, as its name suggests, is a schema in your Postgres Database with tables that store data &lt;em&gt;about&lt;/em&gt; the data in your database. You can write queries against the tables in this schema to learn more about your database(s).&lt;/p&gt;

&lt;p&gt;You can find the full list of tables in the information schema &lt;a href="https://www.postgresql.org/docs/16/information-schema.html" rel="noopener noreferrer"&gt;in the Postgres Docs&lt;/a&gt;, but I'm sharing a few that are handy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Find all the schemas in your database&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; 
&lt;span class="k"&gt;distinct&lt;/span&gt; &lt;span class="n"&gt;table_schema&lt;/span&gt; 
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;information_schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tables&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Find all the tables in your schema&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt; 
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;information_schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tables&lt;/span&gt; 
&lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;table_schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'public'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Information about the columns in a table&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; 
  &lt;span class="k"&gt;column_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;data_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
  &lt;span class="n"&gt;column_default&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;is_nullable&lt;/span&gt; 
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;information_schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; 
&lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'sales'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Name of all the constraints for a given table&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; 
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;information_schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;constraint_column_usage&lt;/span&gt; 
&lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;table_schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'public'&lt;/span&gt; 
&lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'sales'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;a href="https://www.postgresql.org/docs/current/views.html" rel="noopener noreferrer"&gt;System Views&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;Postgres System Views are a collection of views that also have metadata. These tables are listed as system views. All of them are part of the &lt;code&gt;pg_catalog&lt;/code&gt; schema.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Find indexes in a schema&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; 
&lt;span class="n"&gt;tablename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;indexname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;indexdef&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_catalog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pg_indexes&lt;/span&gt; &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;schemaname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'public'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Find indexes for a table&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt; 
&lt;span class="n"&gt;tablename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;indexname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;indexdef&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_catalog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pg_indexes&lt;/span&gt; 
&lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;schemaname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'public'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;tablename&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'mytable'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also list materialized views in your database with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
&lt;span class="n"&gt;schemaname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;matviewname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;ispopulated&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;definition&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_matviews&lt;/span&gt; 
&lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;schemaname&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'public'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;a href="https://www.postgresql.org/docs/current/catalogs-overview.html" rel="noopener noreferrer"&gt;System Catalogs&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;"The system catalogs are the place where a relational database management system stores schema metadata, such as information about tables and columns, and internal bookkeeping information."&lt;/p&gt;

&lt;p&gt;Like Systems Views, the system catalog tables are also part of the &lt;code&gt;pg_catalog&lt;/code&gt; schema. This schema, just like the &lt;code&gt;information_schema&lt;/code&gt; exists by default in your Postgres database.&lt;/p&gt;

&lt;p&gt;If I want some estimates about the size of a table, the System Catalog has a table, &lt;a href="https://www.postgresql.org/docs/current/catalog-pg-class.html" rel="noopener noreferrer"&gt;pg_class&lt;/a&gt; that lets me inspect the table size and other attributes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
&lt;span class="n"&gt;relname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="n"&gt;relpages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="n"&gt;reltuples&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_catalog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pg_class&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;relname&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'sales'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    relname    | relpages | reltuples 
---------------+----------+-----------
     sales     |        8 |       568

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or, with some extra functions, you can calculate the size of your tables and indexes*:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;relname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pg_relation_size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;oid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
          &lt;span class="n"&gt;pg_size_pretty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pg_relation_size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;oid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 
     &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_class&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;relnamespace&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'public'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;regnamespace&lt;/span&gt; 
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;relname&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;*Query from &lt;a href="https://aws.amazon.com/blogs/database/designing-high-performance-time-series-data-tables-on-amazon-rds-for-postgresql/" rel="noopener noreferrer"&gt;Designing high-performance time series data tables on Amazon RDS for PostgreSQL&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   relname    |  bytes  | pg_size_pretty 
--------------+---------+----------------
 sales        | 1638400 | 1600 kB
 sales_id_seq |    8192 | 8192 bytes

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Putting It All Together
&lt;/h3&gt;

&lt;p&gt;Queries can be really powerful when you join tables across these schemas.&lt;br&gt;
For example, you can list all of the constraints of a table. Here's a query&lt;br&gt;
that will give you all of the constraints for a given table in your database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;column_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;constraint_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="k"&gt;case&lt;/span&gt; 
    &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="n"&gt;pgcc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contype&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'p'&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="s1"&gt;'Primary Key Constraint'&lt;/span&gt;
    &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="n"&gt;pgcc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contype&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'f'&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="s1"&gt;'Foreign Key Constraint'&lt;/span&gt;
    &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="n"&gt;pgcc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contype&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'c'&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="s1"&gt;'Check Constraint'&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;constraint_type&lt;/span&gt;
&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pg_get_constraintdef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pgcc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"oid"&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;constraint_definition&lt;/span&gt; 
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;information_schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;constraint_column_usage&lt;/span&gt; &lt;span class="n"&gt;iccu&lt;/span&gt;
&lt;span class="k"&gt;inner&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt;  &lt;span class="n"&gt;pg_catalog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pg_constraint&lt;/span&gt; &lt;span class="n"&gt;pgcc&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;iccu&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;constraint_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pgcc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conname&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'parent'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt; &lt;span class="k"&gt;table_name&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;column_name&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;       &lt;span class="k"&gt;constraint_name&lt;/span&gt;        &lt;span class="o"&gt;|&lt;/span&gt;    &lt;span class="n"&gt;constraint_type&lt;/span&gt;     &lt;span class="o"&gt;|&lt;/span&gt;             &lt;span class="n"&gt;constraint_definition&lt;/span&gt;             
&lt;span class="c1"&gt;------------+-------------+------------------------------+------------------------+-----------------------------------------------&lt;/span&gt;
 &lt;span class="n"&gt;parent&lt;/span&gt;     &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;          &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;parent_pkey&lt;/span&gt;                  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;Primary&lt;/span&gt; &lt;span class="k"&gt;Key&lt;/span&gt; &lt;span class="k"&gt;Constraint&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;parent&lt;/span&gt;     &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;          &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;child_parent_id_fkey&lt;/span&gt;         &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;Foreign&lt;/span&gt; &lt;span class="k"&gt;Key&lt;/span&gt; &lt;span class="k"&gt;Constraint&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parent_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
 &lt;span class="n"&gt;parent&lt;/span&gt;     &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;parent_name&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;name_less_than_20_characters&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;Check&lt;/span&gt; &lt;span class="k"&gt;Constraint&lt;/span&gt;       &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;CHECK&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parent_name&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to find more information about the Information Schema or other Postgres System Tables you can find them here:&lt;br&gt;
&lt;a href="https://www.postgresql.org/docs/16/information-schema.html" rel="noopener noreferrer"&gt;Information Schema&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.postgresql.org/docs/16/catalogs.html" rel="noopener noreferrer"&gt;System Catalogs&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.postgresql.org/docs/16/views.html" rel="noopener noreferrer"&gt;System Views&lt;/a&gt;&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>metadata</category>
      <category>performance</category>
    </item>
    <item>
      <title>Analyzing Retail Sales Data in SQL</title>
      <dc:creator>Jonathan Powell</dc:creator>
      <pubDate>Wed, 24 Jan 2024 01:37:32 +0000</pubDate>
      <link>https://dev.to/itsjjpowell/analyzing-retail-sales-data-in-sql-43f7</link>
      <guid>https://dev.to/itsjjpowell/analyzing-retail-sales-data-in-sql-43f7</guid>
      <description>&lt;h2&gt;
  
  
  Intro
&lt;/h2&gt;

&lt;p&gt;This year one of my goals was to improve my data analysis skills. As a software engineer, most of time is spent in application level code. Though I work with relational databases, I really don't work with complex queries beyond a few joins. But, data analysis is a common part of our job. If you've ever been on-call, you'v'e probably had to query your database to understand the state of your sysem, who is impacted, when did an issue start? &lt;/p&gt;

&lt;p&gt;To improve my data analysis skills I got a subscription to DataCamp, a popular online course platform focused on data analytics, data engineering, and AI. I've completed a number of courses, and now I'm testing out my skills by doing what's called an Exploratory Data Analysis (EDA).&lt;/p&gt;

&lt;h2&gt;
  
  
  Exploratory Data Analysis
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Table Of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Overview&lt;/li&gt;
&lt;li&gt;Findings&lt;/li&gt;
&lt;li&gt;Exploratory Queries&lt;/li&gt;
&lt;li&gt;Preparing the Data&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Overview
&lt;/h3&gt;

&lt;p&gt;I grabbed this &lt;a href="https://www.kaggle.com/datasets/bravehart101/sample-supermarket-dataset" rel="noopener noreferrer"&gt;Superstore Sales Dataset&lt;/a&gt; from Kaggle as a CSV file.&lt;/p&gt;

&lt;p&gt;It has 13 columns that include sales data for a generic superstore:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ship Mode: Mode of shipping used for shipment delivery&lt;/li&gt;
&lt;li&gt;Segment: (Categorical) Customer segment product was shipped to&lt;/li&gt;
&lt;li&gt;Country: Country in which the shipment was delivered&lt;/li&gt;
&lt;li&gt;City: City in which shipment was delivered&lt;/li&gt;
&lt;li&gt;State: State in which the shipment was delivered&lt;/li&gt;
&lt;li&gt;Postal Code: Postal code the shipment was delivered to&lt;/li&gt;
&lt;li&gt;Region: Country region&lt;/li&gt;
&lt;li&gt;Category: The category product belongs to&lt;/li&gt;
&lt;li&gt;Sub-Category: Sub-category of the product&lt;/li&gt;
&lt;li&gt;Sales: Sale made in USD&lt;/li&gt;
&lt;li&gt;Quantity: Product quantity&lt;/li&gt;
&lt;li&gt;Discount: Discount given on the product&lt;/li&gt;
&lt;li&gt;Profit: Profit/loss made on the sale&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Given this sales data, I asked a few questions to learn more.&lt;br&gt;
I used SQL to parse the CSV and load it into a table I created and I asked a few questions about the data.&lt;/p&gt;
&lt;h2&gt;
  
  
  Findings
&lt;/h2&gt;

&lt;p&gt;Here were some of my findings from querying the dataset.&lt;/p&gt;

&lt;p&gt;Overall profit of the dataset was  $286,397.02 dollars. California, New York, and Washington State contributed most to the overall profits. &lt;/p&gt;

&lt;p&gt;When organized by region the West had the most profit, followed by the East, South, and finally Central regions.&lt;/p&gt;

&lt;p&gt;Though capital cities tended to be the highest profit cities for states, some states had non-capital cities as their highest profit cities. For example, Lebanon, Tennessee was the highest profit city in the state.  &lt;/p&gt;

&lt;p&gt;Technology was the most profitable product category among the three categories (Tech, Furniture, Office Supplies) but the Office Supplies categoryhad the most sales (~6200 sales). &lt;/p&gt;

&lt;p&gt;Though total technology sales were lower, Tech products had the highest profit per unit. Corporate Copiers had the highest profit per unit with ~ $271 dollars of profit per copier. By comparison, the highest profit per unit for Office Supplies was ~$14, for Appliances.&lt;/p&gt;

&lt;p&gt;Across most states (48), products in the Home Office segment had the most sales. Binders and Paper sold the most units of Home Office items with 1,111 binders sold and 1,021 units of Paper sold.&lt;/p&gt;
&lt;h2&gt;
  
  
  Exploratory Queries
&lt;/h2&gt;

&lt;p&gt;How many sales were made in the dataset?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;-- 9994&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;How many total items were sold?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;-- 37873&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What was the total profit overall? And, what were the top 10 highest profit states?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;aggregate_profits&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="k"&gt;CASE&lt;/span&gt; 
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt;
        &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'Overall'&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_profit&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; 
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;GROUPING&lt;/span&gt; &lt;span class="k"&gt;SETS&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;())&lt;/span&gt; 
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_profit&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;aggregate_profits&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_profit&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt; &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    country    |   state    | total_profit 
---------------+------------+--------------
 Overall       |            |  286397.0217
 United States | California |   76381.3871
 United States | New York   |   74038.5486
 United States | Washington |   33402.6517
 United States | Michigan   |   24463.1876
 United States | Virginia   |   18597.9504
 United States | Indiana    |   18382.9363
 United States | Georgia    |   16250.0433
 United States | Kentucky   |   11199.6966
 United States | Minnesota  |   10823.1874
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Which city had the most profit in each state?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;ranked_profits_by_state_city&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;profit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;  
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;profit&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; 
&lt;span class="n"&gt;ranked_profits_by_state_city&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        state         |      city       |   profit   
----------------------+-----------------+------------
 Alabama              | Mobile          |  2175.8292
 Arizona              | Glendale        |   182.8598
 Arkansas             | Fayetteville    |  1691.9419
 California           | Los Angeles     | 30440.7579
 Colorado             | Thornton        |   140.8398
 Connecticut          | Fairfield       |  1221.6226
 Delaware             | Newark          |  8086.1715
 District of Columbia | Washington      |  1059.5893
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What were total sales and profit by catagory, and subcategory?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="n"&gt;sub_category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_sales&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;profit&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; 
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;ROLLUP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sub_category&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt; &lt;span class="n"&gt;NULLS&lt;/span&gt; &lt;span class="k"&gt;FIRST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sub_category&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt; &lt;span class="n"&gt;NULLS&lt;/span&gt; &lt;span class="k"&gt;FIRST&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    category     | sub_category | total_sales |   profit    
-----------------+--------------+-------------+-------------
                 |              |        9994 | 286397.0217
 Furniture       |              |        2121 |  18451.2728
 Furniture       | Bookcases    |         228 |  -3472.5560
 Furniture       | Chairs       |         617 |  26590.1663
 Furniture       | Furnishings  |         957 |  13059.1436
 Furniture       | Tables       |         319 | -17725.4811
 Office Supplies |              |        6026 | 122490.8008
 Office Supplies | Appliances   |         466 |  18138.0054
 Office Supplies | Art          |         796 |   6527.7870
 Office Supplies | Binders      |        1523 |  30221.7633
 Office Supplies | Envelopes    |         254 |   6964.1767
 Office Supplies | Fasteners    |         217 |    949.5182
 Office Supplies | Labels       |         364 |   5546.2540
 Office Supplies | Paper        |        1370 |  34053.5693
 Office Supplies | Storage      |         846 |  21278.8264
 Office Supplies | Supplies     |         190 |  -1189.0995
 Technology      |              |        1847 | 145454.9481
 Technology      | Accessories  |         775 |  41936.6357
 Technology      | Copiers      |          68 |  55617.8249
 Technology      | Machines     |         115 |   3384.7569
 Technology      | Phones       |         889 |  44515.7306

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What were sales and profit by region and category?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; 
&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_sales&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_profit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;ROLLUP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="n"&gt;NULLS&lt;/span&gt; &lt;span class="k"&gt;FIRST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="n"&gt;NULLS&lt;/span&gt; &lt;span class="k"&gt;FIRST&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; region  |    category     | total_sales | total_profit 
---------+-----------------+-------------+--------------
         |                 |        9994 |  286397.0217
 Central |                 |        2323 |   39706.3625
 Central | Furniture       |         481 |   -2871.0494
 Central | Office Supplies |        1422 |    8879.9799
 Central | Technology      |         420 |   33697.4320
 East    |                 |        2848 |   91522.7800
 East    | Furniture       |         601 |    3046.1658
 East    | Office Supplies |        1712 |   41014.5791
 East    | Technology      |         535 |   47462.0351
 South   |                 |        1620 |   46749.4303
 South   | Furniture       |         332 |    6771.2061
 South   | Office Supplies |         995 |   19986.3928
 South   | Technology      |         293 |   19991.8314
 West    |                 |        3203 |  108418.4489
 West    | Furniture       |         707 |   11504.9503
 West    | Office Supplies |        1897 |   52609.8490
 West    | Technology      |         599 |   44303.6496
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What were sales, and profits by segment, cateory, and subcategory&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; 
        &lt;span class="n"&gt;segment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;sub_category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profit&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_profit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;units_sold&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt; &lt;span class="nb"&gt;NUMERIC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;profit_per_unit&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; 
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;segment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sub_category&lt;/span&gt; 
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;profit_per_unit&lt;/span&gt;  &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   segment   |    category     | sub_category | total_profit | units_sold | profit_per_unit 
-------------+-----------------+--------------+--------------+------------+-----------------
 Consumer    | Technology      | Copiers      |     24083.71 |        117 |          205.84
 Consumer    | Technology      | Phones       |     23837.11 |       1685 |           14.15
 Consumer    | Technology      | Accessories  |     20735.92 |       1578 |           13.14
 Consumer    | Furniture       | Chairs       |     13235.33 |       1234 |           10.73
 Consumer    | Technology      | Machines     |      2141.06 |        217 |            9.87
 Consumer    | Office Supplies | Appliances   |      6981.93 |        908 |            7.69
 Consumer    | Office Supplies | Envelopes    |      3264.41 |        442 |            7.39
 Consumer    | Office Supplies | Binders      |     17995.60 |       3015 |            5.97
 Consumer    | Office Supplies | Paper        |     15534.64 |       2602 |            5.97
 Consumer    | Office Supplies | Storage      |      7104.20 |       1619 |            4.39
 Consumer    | Furniture       | Furnishings  |      7919.42 |       1834 |            4.32
 Consumer    | Office Supplies | Labels       |      3075.99 |        715 |            4.30
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What are the most popular product segments by quantity sold?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;segment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;popularity&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;segment&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        state         |   segment   | popularity 
----------------------+-------------+------------
 Alabama              | Home Office |          1
 Alabama              | Consumer    |          2
 Alabama              | Corporate   |          3
 Arizona              | Home Office |          1
 Arizona              | Corporate   |          2
 Arizona              | Consumer    |          3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Preparing the Data
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Using the psql cli, I create the database
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;psql&lt;/span&gt; &lt;span class="c1"&gt;-- Launch Postgres CLI&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;DATABASE&lt;/span&gt; &lt;span class="n"&gt;sales_db&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="n"&gt;sales_db&lt;/span&gt; &lt;span class="c1"&gt;-- Connect to the newly created database&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create the database table for the data in the csv
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;bigserial&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;ship_mode&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;segment&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="k"&gt;state&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;postal_code&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;sub_category&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;quantity&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;discount&lt;/span&gt; &lt;span class="nb"&gt;FLOAT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;profit&lt;/span&gt; &lt;span class="nb"&gt;NUMERIC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Load the csv file intot the table&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Note: I removed the header row in the csv before copying it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;COPY&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ship_mode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;segment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;postal_code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;sub_category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;discount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;profit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="s1"&gt;'/path/to/SampleSuperstore.csv'&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DELIMITER&lt;/span&gt; &lt;span class="s1"&gt;','&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FORMAT&lt;/span&gt; &lt;span class="n"&gt;csv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HEADER&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command failed because the sales column in the csv file actually a numeric type, not a bigint. I changed the column to a numeric type to allow the data to be imported.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="k"&gt;TYPE&lt;/span&gt; &lt;span class="nb"&gt;NUMERIC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the correct types I could now run the COPY command and populate the&lt;br&gt;
database table.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>data</category>
      <category>csv</category>
      <category>postgres</category>
    </item>
    <item>
      <title>Retrieving Files with the 'fetch' API</title>
      <dc:creator>Jonathan Powell</dc:creator>
      <pubDate>Sun, 24 May 2020 16:34:05 +0000</pubDate>
      <link>https://dev.to/itsjjpowell/retrieving-files-with-the-fetch-api-i5l</link>
      <guid>https://dev.to/itsjjpowell/retrieving-files-with-the-fetch-api-i5l</guid>
      <description>&lt;p&gt;Javascript's &lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API/Using_Fetch" rel="noopener noreferrer"&gt;Fetch API&lt;/a&gt; is usually used to get data from an API. But it can also be used to retrieve files!&lt;/p&gt;

&lt;h1&gt;
  
  
  Fetch a .txt file
&lt;/h1&gt;

&lt;p&gt;The &lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/Response" rel="noopener noreferrer"&gt;Response&lt;/a&gt; object that is returned from 'fetch', has a few methods that let you retrieve the data returned from the request&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;.json(): returns json&lt;/li&gt;
&lt;li&gt;.text(): returns a string of all the text in the response&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We use the .text() method to get a string of the text from a file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;example.txt&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; 
&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;textString&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;textString&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The process is identical if we wanted to retrieve a .csv file and do something with the data that's in the file. But we have some extra code to break up the file into rows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;example.csv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; 
&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;csvString&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;//Split the csv into rows&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;csvString&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;row&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;//Split the row into each of the comma separated values&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;row&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look at this GitHub repo for example code:&lt;br&gt;
&lt;a href="https://github.com/jpowell96/readFilesWithFetch" rel="noopener noreferrer"&gt;https://github.com/jpowell96/readFilesWithFetch&lt;/a&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>fetch</category>
      <category>files</category>
      <category>api</category>
    </item>
    <item>
      <title>Depth First Search Javascript.</title>
      <dc:creator>Jonathan Powell</dc:creator>
      <pubDate>Tue, 23 Jul 2019 20:59:43 +0000</pubDate>
      <link>https://dev.to/itsjjpowell/depth-first-search-javascript-5c1h</link>
      <guid>https://dev.to/itsjjpowell/depth-first-search-javascript-5c1h</guid>
      <description>&lt;p&gt;Depth-First Search is another one of the algos that can come up on interview questions. Here it is in Javascript. &lt;/p&gt;

&lt;p&gt;The awesome thing about DFS and BFS is that there's just a one line change to go from a DFS to a BFS - and vice versa!&lt;/p&gt;


&lt;div class="ltag__replit"&gt;
  &lt;iframe height="550px" src="https://repl.it/@JonathanPowell2/DepthFirstSearch?lite=true"&gt;&lt;/iframe&gt;
&lt;/div&gt;


</description>
      <category>replit</category>
      <category>node</category>
    </item>
    <item>
      <title>Writing a simple version of Breadth First Search in Javascript</title>
      <dc:creator>Jonathan Powell</dc:creator>
      <pubDate>Tue, 23 Jul 2019 20:57:39 +0000</pubDate>
      <link>https://dev.to/itsjjpowell/writing-a-simple-version-of-breadth-first-search-in-javascript-5493</link>
      <guid>https://dev.to/itsjjpowell/writing-a-simple-version-of-breadth-first-search-in-javascript-5493</guid>
      <description>&lt;p&gt;Breadth First Search is one of those top 10 algorithms to know for an interview. Here it is in javascript. This code works if you're using an adjacency list to represent your graph.&lt;/p&gt;


&lt;div class="ltag__replit"&gt;
  &lt;iframe height="550px" src="https://repl.it/@JonathanPowell2/Breadth-First-Search?lite=true"&gt;&lt;/iframe&gt;
&lt;/div&gt;


</description>
      <category>replit</category>
      <category>node</category>
      <category>datastructures</category>
      <category>graphs</category>
    </item>
  </channel>
</rss>
