AndreasHeissenberger

Posted on • Originally published at heissenberger.at

How to handle index key specification without a key length and key was too long

You create your data structure and define a table that allows UPSERT (insert if the document does not exist) and get one of these errors:

1170 BLOB/TEXT column 'collection' used in key specification without a key length

Specified key was too long; max key length is 3072 bytes

The Problem

When creating an index there is an implementation-specific limit on the maximum number of bytes allowed per key.
The limit is in bytes and not characters, because the charset of the collation determines how many bytes are needed per character.

Here is a sample table which uses the columns collection and document_id for the primary key:

CREATE TABLE type_varchar (
  `collection` TEXT COLLATE utf8mb3_unicode_ci NOT NULL,
  `document_id` varchar(255) COLLATE utf8mb3_unicode_ci NOT NULL,
  `document_type` longtext COLLATE utf8mb3_unicode_ci NOT NULL,
  `document` longtext COLLATE utf8mb3_unicode_ci NOT NULL COMMENT '(DC2Type:json)',
  PRIMARY KEY (collection,document_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_unicode_ci;

The result of this will be this error:

SQLSTATE[42000]: Syntax error or access violation: 1170 BLOB/TEXT column 'collection' used in key specification without a key length

Some people or sources on the web will tell you not to use BLOB/TEXT for the index and suggest using VARCHAR instead:

CREATE TABLE type_varchar (
  `collection` varchar(800) COLLATE utf8mb3_unicode_ci NOT NULL,
  `document_id` varchar(255) COLLATE utf8mb3_unicode_ci NOT NULL,
  `document_type` longtext COLLATE utf8mb3_unicode_ci NOT NULL,
  `document` longtext COLLATE utf8mb3_unicode_ci NOT NULL COMMENT '(DC2Type:json)',
  PRIMARY KEY (collection,document_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_unicode_ci;

The result of this will be this error:

SQLSTATE[42000]: Syntax error or access violation: Specified key was too long; max key length is 3072 bytes

How To Fix?

Fixing both errors only requires adding a prefix length of (768) to the column in the primary key:

PRIMARY KEY (collection(768),document_id)
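
For reference, the complete statement with the prefix applied to the TEXT variant might look like the sketch below. The table name type_text is made up for this example, and the sketch assumes InnoDB with the DYNAMIC row format, which is where the 3072 byte limit from the error message applies.

```sql
-- Hypothetical table name; the columns are copied from the sample above.
CREATE TABLE type_text (
  `collection` TEXT COLLATE utf8mb3_unicode_ci NOT NULL,
  `document_id` varchar(255) COLLATE utf8mb3_unicode_ci NOT NULL,
  `document_type` longtext COLLATE utf8mb3_unicode_ci NOT NULL,
  `document` longtext COLLATE utf8mb3_unicode_ci NOT NULL,
  -- Only the first 768 characters of collection become part of the key.
  PRIMARY KEY (collection(768), document_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_unicode_ci;

-- An existing table can be changed the same way.
ALTER TABLE type_text
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (collection(768), document_id);
```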

But how do you calculate the value for the length?

  1. get the charset of your column from its collation: COLLATE utf8mb3_unicode_ci
  2. look up the maximum number of bytes per character for the character set utf8mb3 (or its alias utf8)

utf8mb3: 3 bytes
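
Both lookups can also be done on the server itself. The queries below are a small sketch using the standard information_schema tables; the table name type_varchar is the one from the samples above, and the current database is assumed to be the one containing it.

```sql
-- Step 1: which charset and collation do the key columns use?
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'type_varchar'
  AND COLUMN_NAME IN ('collection', 'document_id');

-- Step 2: how many bytes can one character of that charset need?
-- MAXLEN is the maximum number of bytes per character.
SELECT CHARACTER_SET_NAME, MAXLEN
FROM information_schema.CHARACTER_SETS
WHERE CHARACTER_SET_NAME LIKE 'utf8%';
```

Depending on the server version, utf8mb3 may be listed under its alias utf8.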

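Based on the second error, the database engine (MySQL / MariaDB) allows a maximum of 3072 bytes for the key. If collection were the only column in the key, that would give:
3072 / 3 = 1024

On my MariaDB 10.8.3 instance a length of 1024 fails. The likely reason is that the 3072 bytes cover the whole key and not each column: document_id already needs 255 × 3 = 765 bytes, which leaves (3072 - 765) / 3 = 769 characters for collection. A length of 768, which is also what a single column with a 4 byte charset like utf8mb4 would allow (3072 / 4 = 768), stays within that budget and works!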
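
The byte budget can be worked out with a quick query. This is only a sketch and assumes that the 3072 bytes cover all columns of the key together, which the second failing example suggests: varchar(800) alone would need only 800 × 3 = 2400 bytes and would fit.

```sql
SELECT
  3072                        AS max_key_bytes,
  255 * 3                     AS document_id_bytes,      -- varchar(255) in utf8mb3
  FLOOR((3072 - 255 * 3) / 3) AS max_collection_prefix;  -- characters left for collection
```

Under that assumption the last column comes out as 769, so the prefix of 768 used above fits.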

Difference between TEXT and VARCHAR

At first glance both support up to 65,535 bytes and only use the space needed by the actual data, but MySQL imposes a limit of 65,535 bytes on the maximum size of each row.
The size of a VARCHAR column counts towards the maximum row size, while TEXT columns store their data by reference, so they only contribute 9-12 bytes.
That means even if the "theoretical" maximum size of your VARCHAR field is 65,535 bytes, you won't be able to reach it if you have more than one column in your table.
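
The row size limit can be seen with two small tables. The table names are made up for this sketch, and latin1 is used only to keep the arithmetic at one byte per character.

```sql
-- Two VARCHAR columns: 40000 + 40000 bytes (plus length bytes) exceed 65,535.
-- This statement is expected to fail with error 1118 (Row size too large).
CREATE TABLE row_limit_varchar (
  a VARCHAR(40000),
  b VARCHAR(40000)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- With TEXT for the second column only a few bytes count towards the row size,
-- so this statement is expected to succeed.
CREATE TABLE row_limit_text (
  a VARCHAR(40000),
  b TEXT
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
```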

TEXT

  • fixed maximum of 65,535 bytes (no option to limit the size)
  • needs 2+c bytes on disk, where c is the number of bytes used by the string based on the charset
  • no default value in older versions (MariaDB 10.2.1+ allows one, MySQL 8.0.13+ allows expression defaults only)
  • temporary tables will use a disk-based table (the MEMORY engine does not support TEXT/BLOB)
  • requires an explicit prefix length for the part which becomes a key

VARCHAR(N)

  • variable maximum size of N characters (the size in bytes depends on the charset)
  • N is between 0 and 65,535 - Performance implications / Stackexchange
  • takes 1 + c bytes (if values need at most 255 bytes) or 2 + c bytes (if values may need more than 255 bytes) of disk space, where c is the length of the stored string in bytes
  • works as part of an index, except when the column is longer than the maximum key length, which requires setting the length of the key manually

Some of the implementation details depend on the engine e.g. InnoDB or MyISAM.

Real content using utf8mb4 will use 1, 2, 3 or 4 bytes depending on the character!
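
The difference between bytes and characters is easy to check with LENGTH() (bytes) and CHAR_LENGTH() (characters). The sketch below uses hex literals with the _utf8mb4 introducer so that it does not depend on the charset of the connection.

```sql
SELECT
  LENGTH(_utf8mb4 x'61')            AS bytes_ascii_a,  -- U+0061, 1 byte
  LENGTH(_utf8mb4 x'C3A9')          AS bytes_e_acute,  -- U+00E9, 2 bytes
  LENGTH(_utf8mb4 x'E282AC')        AS bytes_euro,     -- U+20AC, 3 bytes
  LENGTH(_utf8mb4 x'F09F9880')      AS bytes_emoji,    -- U+1F600, 4 bytes
  CHAR_LENGTH(_utf8mb4 x'F09F9880') AS chars_emoji;    -- still 1 character
```

For the index limit the server does not look at the real content: it reserves the maximum number of bytes per character for every character of the key.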

Do Primary/Unique Keys with limited length fail?

Does a shortened index key still provide a unique key which fulfills all requirements of a primary/unique key and does not allow duplicates?

Test with an index key with length 1 - collection(1):

CREATE TABLE type_varchar (
  `collection` varchar(800) COLLATE utf8mb3_unicode_ci NOT NULL,
  `document_id` varchar(255) COLLATE utf8mb3_unicode_ci NOT NULL,
  `document_type` longtext COLLATE utf8mb3_unicode_ci NOT NULL,
  `document` longtext COLLATE utf8mb3_unicode_ci NOT NULL COMMENT '(DC2Type:json)',
  PRIMARY KEY (collection(1),document_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_unicode_ci;

Insert the same content twice:

INSERT INTO `type_varchar` (`collection`, `document_id`, `document_type`, `document`)
VALUES
    (REPEAT('1',800), '1', 't1', 'doc1');
INSERT INTO `type_varchar` (`collection`, `document_id`, `document_type`, `document`)
VALUES
    (REPEAT('1',800), '1', 't1', 'doc1');

The result is the expected error:

Duplicate entry '1111111111111111111111111111111111111111111111111111111111111...' for key 'PRIMARY'

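PRIMARY KEY or UNIQUE KEY with limited length are valid unique indexes! Keep in mind that uniqueness is enforced on the prefix only: two rows whose values differ only after the prefix also count as duplicates.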
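
The test above inserts identical values, so it does not show how the key treats values that differ after the prefix. A minimal sketch of that case follows; the table name prefix_demo and the sample values are made up for the example.

```sql
CREATE TABLE prefix_demo (
  `collection` varchar(800) COLLATE utf8mb3_unicode_ci NOT NULL,
  `document_id` varchar(255) COLLATE utf8mb3_unicode_ci NOT NULL,
  PRIMARY KEY (collection(1), document_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_unicode_ci;

INSERT INTO prefix_demo (collection, document_id) VALUES ('1-first', '1');

-- Different value, but same first character and same document_id:
-- expected to fail with a duplicate entry error (1062).
INSERT INTO prefix_demo (collection, document_id) VALUES ('1-second', '1');

-- Different first character: expected to be accepted.
INSERT INTO prefix_demo (collection, document_id) VALUES ('2-first', '1');
```

So the prefix should be long enough that real values never share it, otherwise valid rows get rejected.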

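What are the implications?

  • performance can suffer, as the index only narrows the search down to rows sharing the prefix and the rows themselves have to be read to compare the full value
  • the size of the index will be very small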
