DEV Community

Dmitry Romanoff
Dmitry Romanoff

Posted on

Automate Foreign Key Index Checks in PostgreSQL: A Practical Bash Script

Image description

In the realm of database management, ensuring that foreign keys (FKs) are properly indexed is crucial for maintaining optimal query performance. Indexes on foreign keys can significantly speed up query operations and enforce referential integrity efficiently. However, sometimes we overlook this aspect, leading to potential performance issues.

In this article, we’ll explore a Bash script designed to address this issue by checking for foreign keys without indexes on their source columns in PostgreSQL databases. The script, check_fk_without_index_on_src_columns.sh, automates the process of identifying such foreign keys, making it easier for database administrators to maintain and optimize their database systems.

Script Overview

The check_fk_without_index_on_src_columns.sh script is a robust tool for checking PostgreSQL databases for foreign keys that lack indexes on their source columns. It generates a detailed report on the missing indexes, which helps in identifying areas for performance improvement. Here's a breakdown of how this script works and how you can use it.

Key Features

  1. The script uses command-line arguments to specify database connection details and report settings. It accepts parameters for hostname, port, database pattern, offset, and limit.
  2. It generates a timestamped report that details the foreign keys without indexes on their source columns. The report is saved in a designated directory.
  3. The script connects to the PostgreSQL server and processes databases that match the specified pattern. It iterates over each database, runs queries to identify missing indexes, and includes the results in the report.
  4. The script outputs results both to the terminal and to a file, ensuring that the report is available for future reference.

Usage Instructions

To use this script, you’ll need to provide several parameters:

-h for PostgreSQL hostname
-p for PostgreSQL port
-x for database pattern
-a for report offset (starting from 0)
-b for report limit

For example:

./check_fk_without_index_on_src_columns.sh -h localhost -p 5432 -x '%mydb%' -a 0 -b 10
Enter fullscreen mode Exit fullscreen mode

This command will check databases that match the %mydb% pattern, starting from the offset 0 and limiting the results to 10 databases.

The Script in Action

Here’s the full script for checking foreign keys without indexes:

#!/bin/bash

##################################################################################################################################
#
# Name: check_fk_without_index_on_src_columns.sh
#
# Description: Check Postgres FKs with missing indexes on source
#
# Author: Dmitry
#
# Date: 01-Jan-2020
#
# Usage Example:
#
#     ./check_fk_without_index_on_src_columns.sh -h localhost -p port -x db_pattern -a start_position -b end_position
#
####################################################################################################################################

helpFunction()
{
   echo ""
   echo "Usage: $0 -h hostname -p port -x db_pattern -a offset -b limit"
   echo -e "\t-h Postgres hostname"
   echo -e "\t-p Postgers port"
   echo -e "\t-x Postgres db pattern"
   echo -e "\t-a Report offset starting from 0"
   echo -e "\t-b Report limit"
   echo -e " "
   echo -e "Example how to run: $0 -h localhost -p 15459 -x % -a 0 -b 10 "
   echo -e " ** during run the output report is generated automatically"
   echo -e " ** the output report is generated in the directory output_dir"
   echo -e " "
   exit 1 # Exit script after printing help
}

while getopts "h:p:x:a:b:" opt
do
   case "$opt" in
      h ) inpHost="$OPTARG" ;;
      p ) inpPort="$OPTARG" ;;
      x ) inpDBPattern="$OPTARG" ;;
      a ) inpOffset="$OPTARG" ;;
      b ) inpLimit="$OPTARG" ;;
      ? ) helpFunction ;; # Print helpFunction in case parameter is non-existent
   esac
done

# Print helpFunction in case parameters are empty
if [ -z "$inpHost" ] || [ -z "$inpPort" ] || [ -z "$inpDBPattern" ] || [ -z "$inpOffset" ] || [ -z "$inpLimit" ] 
then
   echo "Some or all of the parameters are empty";
   helpFunction
fi

# Begin script in case all parameters are correct

db_sdm_name=`sdm status | grep $inpPort | awk ' { print $1 }'`
current_date_time="$(date +'%Y%m%d_%H%M%S')" 

report_name="Check postgres FKs with missing index on source column(s)"

output_dir="./output_dir"

output_trace_name="report_check_pg_fk_without_indexes_${current_date_time}_port_${inpPort}_dbinstance_${db_sdm_name}.trc"

output_full_name="${output_dir}/${output_trace_name}"

touch ${output_full_name}

echo "------------------------------------------------------------------" 
echo "Report: ${report_name} " 
echo "------------------------------------------------------------------" 
echo "Postgres DB Instance: [${db_sdm_name}] port [${inpPort}]" 
echo "$report_name" 
echo "Timestamp: [${current_date_time}]" 
echo "------------------------------------------------------------------" 

echo " " 
echo "Input parameters:" 
echo "---------------- " 
echo "inpHost=$inpHost" 
echo "inpPort=$inpPort" 
echo "inpDBPattern=$inpDBPattern"
echo "inpOffset=$inpOffset"
echo "inpLimit=$inpLimit"
echo "---------------- "

echo "------------------------------------------------------------------" >> $output_full_name
echo "Report: ${report_name} " >> $output_full_name
echo "------------------------------------------------------------------" >> $output_full_name
echo "Postgres DB Instance: [${db_sdm_name}] port [${inpPort}]" >> $output_full_name
echo "$report_name"  >> $output_full_name
echo "Timestamp: [${current_date_time}]" >> $output_full_name
echo "------------------------------------------------------------------" >> $output_full_name

echo " " >> $output_full_name
echo "Input parameters:" >> $output_full_name
echo "---------------- " >> $output_full_name
echo "inpHost=$inpHost" >> $output_full_name
echo "inpPort=$inpPort" >> $output_full_name
echo "inpDBPattern=$inpDBPattern" >> $output_full_name
echo "inpOffset=$inpOffset" >> $output_full_name
echo "inpLimit=$inpLimit" >> $output_full_name
echo "---------------- " >> $output_full_name

num_of_dbs_to_process=$(psql -h $inpHost -p $inpPort -t -c "select count(M.*) from (select datname from pg_database where datname like '${inpDBPattern}' order by 1 offset ${inpOffset} limit ${inpLimit}) M " postgres)

echo " "
echo "------------------------------------------------------------------"
echo " Number of databases: ${num_of_dbs_to_process}"
echo "------------------------------------------------------------------"

echo " " >> $output_full_name
echo "------------------------------------------------------------------" >> $output_full_name
echo " Number of databases: ${num_of_dbs_to_process}" >> $output_full_name
echo "------------------------------------------------------------------" >> $output_full_name

idy=0

idx=${inpOffset}

idx=$(($idx + 1))

for DBs_to_process in $(psql -h $inpHost -p $inpPort -t -c "select M.* from (select datname from pg_database where datname like '${inpDBPattern}' order by 1 offset ${inpOffset} limit ${inpLimit} ) M " postgres); do
echo "#${idx}:"
echo "DB:${DBs_to_process}" 

echo "#${idx}:" >> $output_full_name
echo "DB:${DBs_to_process}" >> $output_full_name

n_of_fks_without_indexes_on_src_columns="1"

n_of_fks_without_indexes_on_src_columns=$(psql -h localhost -p $inpPort -U postgres -d $DBs_to_process -t << EOF
select count(1)
from
(
SELECT c.conrelid::regclass AS "table",
       string_agg(a.attname, ',' ORDER BY x.n) AS columns,
       pg_catalog.pg_size_pretty(
          pg_catalog.pg_relation_size(c.conrelid)
       ) AS size,
       c.conname AS constraint, 
       c.confrelid::regclass AS referenced_table
FROM pg_catalog.pg_constraint c
   /* enumerated key column numbers per foreign key */
   CROSS JOIN LATERAL
      unnest(c.conkey) WITH ORDINALITY AS x(attnum, n)
   /* name for each key column */
   JOIN pg_catalog.pg_attribute a
      ON a.attnum = x.attnum
         AND a.attrelid = c.conrelid
WHERE NOT EXISTS
        (SELECT 1 FROM pg_catalog.pg_index i
         WHERE i.indrelid = c.conrelid 
           AND i.indpred IS NULL
           AND (i.indkey::smallint[])[0:cardinality(c.conkey)-1]
               OPERATOR(pg_catalog.@>) c.conkey)
  AND c.contype = 'f'
GROUP BY c.conrelid, c.conname, c.confrelid
ORDER BY pg_catalog.pg_relation_size(c.conrelid) DESC
) M;
EOF)

if [ "$n_of_fks_without_indexes_on_src_columns" -ne "0" ]; then

psql -h localhost -p $inpPort -U postgres -d $DBs_to_process << EOF
SELECT c.conrelid::regclass AS "table",
       /* list of key column names in order */
       string_agg(a.attname, ',' ORDER BY x.n) AS columns,
       pg_catalog.pg_size_pretty(
          pg_catalog.pg_relation_size(c.conrelid)
       ) AS size,
       c.conname AS constraint,
       c.confrelid::regclass AS referenced_table
FROM pg_catalog.pg_constraint c
   /* enumerated key column numbers per foreign key */
   CROSS JOIN LATERAL
      unnest(c.conkey) WITH ORDINALITY AS x(attnum, n)
   /* name for each key column */
   JOIN pg_catalog.pg_attribute a
      ON a.attnum = x.attnum
         AND a.attrelid = c.conrelid
WHERE NOT EXISTS
        /* is there a matching index for the constraint? */
        (SELECT 1 FROM pg_catalog.pg_index i
         WHERE i.indrelid = c.conrelid
           /* it must not be a partial index */
           AND i.indpred IS NULL
           /* the first index columns must be the same as the
              key columns, but order doesn't matter */
           AND (i.indkey::smallint[])[0:cardinality(c.conkey)-1]
               OPERATOR(pg_catalog.@>) c.conkey)
  AND c.contype = 'f'
GROUP BY c.conrelid, c.conname, c.confrelid
ORDER BY pg_catalog.pg_relation_size(c.conrelid) DESC;
EOF

psql -h localhost -p $inpPort -U postgres -d $DBs_to_process << EOF >> $output_full_name
SELECT c.conrelid::regclass AS "table",
       /* list of key column names in order */
       string_agg(a.attname, ',' ORDER BY x.n) AS columns,
       pg_catalog.pg_size_pretty(
          pg_catalog.pg_relation_size(c.conrelid)
       ) AS size,
       c.conname AS constraint,
       c.confrelid::regclass AS referenced_table
FROM pg_catalog.pg_constraint c
   /* enumerated key column numbers per foreign key */
   CROSS JOIN LATERAL
      unnest(c.conkey) WITH ORDINALITY AS x(attnum, n)
   /* name for each key column */
   JOIN pg_catalog.pg_attribute a
      ON a.attnum = x.attnum
         AND a.attrelid = c.conrelid
WHERE NOT EXISTS
        /* is there a matching index for the constraint? */
        (SELECT 1 FROM pg_catalog.pg_index i
         WHERE i.indrelid = c.conrelid
           /* it must not be a partial index */
           AND i.indpred IS NULL
           /* the first index columns must be the same as the
              key columns, but order doesn't matter */
           AND (i.indkey::smallint[])[0:cardinality(c.conkey)-1]
               OPERATOR(pg_catalog.@>) c.conkey)
  AND c.contype = 'f'
GROUP BY c.conrelid, c.conname, c.confrelid
ORDER BY pg_catalog.pg_relation_size(c.conrelid) DESC;
EOF

idy=$(($idy + 1))

fi

echo "................................................................."

idx=$(($idx + 1))

done

echo "Total Fks with Missing Indexes On Source Columns: $idy"
echo "Total Fks with Missing Indexes On Source Columns: $idy" >> $output_full_name

echo " "
echo "                      --- The End ---                            " 
echo " " 

echo " " >> $output_full_name
echo "                      --- The End ---                            " >> $output_full_name
echo " " >> $output_full_name

echo " " 
Enter fullscreen mode Exit fullscreen mode

Top comments (0)