Postgres Is Underrated—It Handles More than You Think

Jason Skowronski on October 09, 2019

Thinking about scaling beyond your Postgres cluster and adding another data store like Redis or Elasticsearch? Before adopting a more complex infra... [Read Full]
markdown guide
 

Postgres 12 finally added generated columns, which also come in handy for fulltext search (no more triggers for updating tsvector columns):

pgdash.io/blog/postgres-12-generat...

 

Hi, this is a really nice article. The funny thing is, it comes at the right time. We use Postgres and are currently considering to add Redis behind our rust backend with websockets to enable push notifications for turtle (turtle.community), but are not quite sure if this is the right way.

Does anyone have some inspiration or experience on that topic?

 

I'm not sure if it fits your use case perfectly, but take a look at NOTIFY, introduced in PG 9.0 -- it's basically a push notification mechanism built into PostgreSQL.

 
 
 

Hey Jason great post! Quick thing on the full text example you have: if you use plianto_tsquery it will split the words for you and add the AND. If you use phraseto_tsquery it will apply a positional argument (<->) instead of the &, which is great if you're looking for a name or place. The new websearch_to_tsquery (in PG 11) is a great general purpose query builder as well.

 

Great write up Jason. I wasn't aware of PL/Python until now. Definitely a big fun of the JSON data types and have never used hstore but, thank you for sharing the use case.

There is also Postgres-XL that seems to offer a lot and only 5 years old compared to 3 decade-old PostgreSQL.

 

I'm loving your articles Jason, great job again !
It's hard to find in-depth and super high-quality articles like these in this day and age :)

 

This is an amazing complement! Mind if I use this quote on my site?

 

Of course! My pleasure :)

Anytime, keep up the great work!

 

Great article Jason! It's really important to limit the number of technologies used in a project and not follow trends blindly! This is quite a similar analogy to the microservice vs monolith debate.

 

I wasn't aware of tsquery, thanks for pointing that out! - I'm working on a project right now where full on Elasticsearch is probably overkill, but needed something more robust than just doing LIKE

 

100% especially on search. Search in Postgres is very underrated.

 

I think another really important PG plugin to add is TimeScaleDB -- the ability to add efficient time series data collection/query to Postgres is awesome.

 

This is a great article!

You mention “Use multi-column indexes sparingly“, which is generally good advice, however I would qualify that by saying avoid redundant indexes. It’s important to know that multicolumn (btree) indexes have a specific ordering to the columns. If you don’t filter or join a table using one or more of the columns in that order then it can’t use the index.

For example an index on columns a,b. If you only filter on column b then this index can’t be used. It can be used if you filter by just a, or both a and b. Following on from that, if you frequently filter by both a and b then this is a good index to have. If you also have both a single column index on a, and a multicolumn index on a,b then in that specific case the single column index is the redundant one.

 

Great article, I use postgres on heroku for database stuff all the time, I love it and I recommend it all the time.

 

Thanks for the post, I learned a lot of new things! I'd also like to mention that OLAP style queries can be rough with lots of data.

 

I am also thinking about this. RDBMS generally are good at most types of workloads but mixed workloads like OLTP and OLAP on the same system will interfere with each other.

How do people do embedded reporting AND transactions in modern web apps?

NoSQL solutions like ElasticSearch are mentioned but they seem more appropriate for search. Data warehouse solutions like Snowflake, BigQuery, Redshift are good for internal analytics and reporting but they just don’t have the concurrency to support direct queries from customer facing apps.

How else to do this without complex data pipelines or complex infrastructure involving Kafka?

 

I do it with a complex data pipeline and infrastructure involving Kafka!

I've got my eye on Apache Druid, though I haven't spent any real time (sorry, pun totally intended) with it.

I’ve looked at Druid, also considering MemSQL , ClickHouse and others

 

Thanks for this post,
I wrote about using postgres to setup a distributed database. Here is the link for those who are interested dev.to/sh1ftsh/setting-up-distribu...

 

Yyyyeah... You love PG... Until you need to do a simple dumb forking count on it. That's where PG sucks big time.

 

Is “forking” count some special operation or just a nice word in place of swearing?

Regarding (exact) count, pretty much every system has trouble doing this quickly and on most cases you really don’t need an exact count. This is especially true for medium to large data.

Table/index statistics that are kept up to date will give a good approximation.

HyperLogLog was one option mentioned which will give good-enough approximations, another approach is log-normal histograms. I don’t think this challenge is unique to PG.

 

Yes, it's a replacement word, and a reference to "The Good Place" TV series.

And regarding count, no system suffer as much from it as PG. Approximations are rarely an option, and while this challenge is definitely not unique to PG, the PG is doing it worst of all DBMS. Except, maybe, sqlite, and even that I'm not sure.
Such a simple functionality, but can't be implemented properly for more than a dozen of years already. PG's count can be 15x times slower than that of MySQL and 70x slower than Oracle or SQLServer. And that sheet has been there for ages. And is till there.

 

Wow! An awesome review of Postgres important features.
I can't stress out more how devs sometimes just follow the hype without thinking. When someone suggests a NoSQL data store, I always ask her/him to convince me why SQL doesn't work here.

 

Don't forget Materialized Views: can be an awesome way to optimize searches on things that you might be tempted to dump into some NoSQL text-based search tool.

 

Where have I read this article before? Feels like either this is a clone or the other is a clone. Brb...

 

Found it, it's the same guy. Fair play!
infoq.com/articles/postgres-handle...

Thanks for the other one, it was very useful.

 

Maybe InfoQ or Hackernoon? I posted it there as well :)

 
 

More in-depth articles of this kind will make this old code bum happy ;)

 

Just want point out one error (or misunderstanding), B-Tree is not binary as stated in "B-tree indexes B-tree indexes are binary trees"

 
 
 

Nice article!

Nit - there is one redundant "are" in "Extensions are to Postgres are"

 
 

Nice article ! Thank you.

I think there'e a typo in the check strings length example (b- vs b=).

 
 
code of conduct - report abuse