DEV Community

Cover image for How to find useless foreign keys in a database
Andrej Kirejeŭ
Andrej Kirejeŭ

Posted on

How to find useless foreign keys in a database

With times, old databases tend to overgrow with unnecessary or just empty fields. Whilst regular fields are not much of a concern, those with foreign key constraints, especially in tables with millions of records, would needlessly inflate the database file and put performance penalty on every insert/update/delete operation, as appropriate index need to be updated.

For example, an index on a field that holds nothing more than NULL values over a table with 100 000 000 records has a size approximately of 600 MB (true for Firebird 3 database file structure).

The query below helps to spot fields with foreign keys, which contain no more than a given count of unique values (by default, the variable maxuniqvalues is set to 1 to find all fields which are empty or contain exactly one value). Additionally, the condition on minimal record count in a table in question could be set through variable minreccnt.

The result data set includes the following columns: name of the relation, number of records in the relation, name of the field, value (only the first value is shown), and a list of objects dependent on the field. The latter will help a lot if the field to be deleted later.

Though the code is written in Firebird SQL flavor, it could be easily adapted to any modern SQL server.

EXECUTE BLOCK
  RETURNS(
    rn VARCHAR(31),
    reccnt INTEGER,
    fkfieldname VARCHAR(31),
    val INTEGER,
    dependent VARCHAR(8192)
  )
AS
  DECLARE VARIABLE minreccnt DOUBLE PRECISION = 1000000;
  DECLARE VARIABLE maxuniqvalues DOUBLE PRECISION = 1;
BEGIN
  FOR
    SELECT
      rc.rdb$relation_name,
      CAST((1 / idx.rdb$statistics) AS INTEGER),
      idxsfk.rdb$field_name,
      (SELECT
         LIST(TRIM(d.rdb$dependent_name))
         FROM rdb$dependencies d
         WHERE 
           d.rdb$depended_on_name = rc.rdb$relation_name
           AND 
           d.rdb$field_name = idxsfk.rdb$field_name)
    FROM
      rdb$relation_constraints rc
      JOIN rdb$indices idx
        ON idx.rdb$index_name = rc.rdb$index_name
      JOIN rdb$index_segments idxs
        ON idxs.rdb$index_name = idx.rdb$index_name
        AND idxs.rdb$field_position = 0
      JOIN rdb$relation_constraints rcfk
        ON rcfk.rdb$constraint_type = 'FOREIGN KEY'
        AND rcfk.rdb$relation_name = rc.rdb$relation_name
      JOIN rdb$indices idxfk
        ON idxfk.rdb$index_name = rcfk.rdb$index_name
      JOIN rdb$index_segments idxsfk
        ON idxsfk.rdb$index_name = idxfk.rdb$index_name
        AND idxsfk.rdb$field_position = 0
    WHERE
      rc.rdb$constraint_type = 'PRIMARY KEY'
      AND
      (idx.rdb$statistics > 0 
        AND idx.rdb$statistics < (1.0 / :minreccnt))
      AND
      idxfk.rdb$statistics >= (1.0 / :maxuniqvalues)
    ORDER BY
      idx.rdb$statistics ASC
    INTO
      :rn, :reccnt, :fkfieldname, :dependent
  DO BEGIN
    EXECUTE STATEMENT 
      'SELECT FIRST 1 ' || 
      :fkfieldname || 
      ' FROM ' || 
      :rn
      INTO :val;
    SUSPEND;
  END
END
Enter fullscreen mode Exit fullscreen mode

Top comments (1)

Collapse
 
stanislau profile image
Stanislau Shliakhtsich

Well done!