Discussion on: What I learned from only using Select in PostgreSQL

dmfay profile image
Dian Fay

Couple tips :)

  • Views are just stored queries that get run when you invoke the view. The space and power consumption is negligible unless you create a materialized view which actually stores its results (and therefore has to be manually refreshed).
  • GROUP BY is only necessary if you're actually doing aggregate calculations. If you're just pulling individual records without counting, summing, or otherwise deriving a value from multiple rows, you don't even need to specify it. If you just have duplicate rows you want to filter out, you can SELECT DISTINCT instead.

More generally: relational databases are the Swiss Army knife of data storage. The real question is which use cases they aren't suited for -- and that's where you'll find some other model being used. RDBMSs are overkill for key:value storage, which is why Redis exists; they can't handle planet-scale flat data, which is why you have HBase, Cassandra, and so on; and they're traditionally terrible with hierarchical data, which is how we got MongoDB (although Postgres' JSON support is extremely powerful & lets you blend relational and document models quite effectively these days).

I've used RDBMSs in general for web and desktop applications across my career, and Postgres in particular mostly for web. I maintain a moderately popular data mapper for Postgres and Node, so I get some framework-level development in for fun too.

jenc profile image
jen chan Author

re: Materialized view. Maybe this is what my manager meant.
I could see if you had a huuuggggee database maybe it's not worth the time searching through everything to create a view.
They had said it wasn't worth creating a view unless I were to use it all the time or it would slow everything down.

I suppose I was compelled to use group by as I'm a visual person and seeing all similarities in a table is more satisfying and easy to comprehend than individual records.

Thank you for your comprehensive response by the way. I've not heard v much about Hbase or Cassandra.

dmfay profile image
Dian Fay

Think of views as a way to cut down on copy+paste. There's no point in creating them for one-off queries, but if you have a few:

SELECT c.name, o.id, p.name
FROM customers c
JOIN orders o ON o.customer_id = c.id
JOIN order_products op ON op.order_id = o.id
JOIN products p ON p.id = op.product_id;
SELECT p.id, p.name, COUNT(DISTINCT o.country) AS countries_shipped_to
FROM products p
JOIN order_products op ON op.product_id = p.id
JOIN orders o ON o.id = op.order_id
GROUP BY p.id, p.name
ORDER BY countries_shipped_to DESC;

Both of these queries use the same orders-order_products-products information. They do it in a different order since the first is starting from customers and the second from products itself, but it's the same data. Instead of joining those three tables, you could create a view with the query:

SELECT o.*, p.*
FROM orders o
JOIN order_products op ON op.order_id = o.id
JOIN products p ON p.id = op.product_id;

Then you can use the view in your queries and suddenly they're a good bit simpler -- the complexity of the order-to-product relationship is encapsulated in the view, so your queries don't have to manage it themselves.