Double bind the foreign key to avoid unnecessary JOIN in TypeORM

#typeorm #typescript #database

ORM tools are cool but they generate SQL "automagically". The generated code tends to be unnecessarily complex and causes a performance overhead.

In a database, performance depends on the time to compute the information + the time to bring this information back to the server.

In this article, I will show how to reduce the computation time by removing some unnecessary JOINs and how to load only the information you need.

The model

In this article, we will use the simplest possible model :

A table called company
A table called employee

mapped like this :

@Entity({ name: "company" })
export class CompanyEntity extends BaseEntity {

  @PrimaryGeneratedColumn({name: "id"})
  id: string;

  @Column({ name: "creation_date"})
  creationDate: Date;

  @Column({ name: "name"})
  name: string;
}

@Entity({ name: "employee"})
export class EmployeeEntity extends BaseEntity {

  @PrimaryGeneratedColumn()
  id: string;

  @Column({ name: "creation_date" })
  creationDate: Date;

  @Column({ name: "first_name" })
  firstName: string;

  @Column({ name: "last_name" })
  lastName: string;

  @JoinColumn({name: "company_id"})
  @ManyToOne(() => CompanyEntity)
  company: CompanyEntity;
}

Preparation : Activate logging on the local environment

To spot optimization problems early, you should always log the generated SQL at least in your local environment. You can do so using the logging option from the TypeORM configuration

  createTypeOrmOptions(): TypeOrmModuleOptions {
    return {
      type: "postgres",
      host: "localhost",
      port: 5432,
      database: "acme",
      username: "acme",
      password: "acme",
      schema: "acme",
      logging: true, // <= Here !
      entities: [
        CompanyEntity,
        EmployeeEntity
      ],
      synchronize: false
    }
  }

In this way, TypeORM will output the generated SQL in logs, which is a huge help for debugging. Here is an example of output :

query: SELECT "EmployeeEntity"."id" AS "EmployeeEntity_id", "EmployeeEntity"."creation_date" AS "EmployeeEntity_creation_date", "EmployeeEntity"."first_name" AS "EmployeeEntity_first_name", "EmployeeEntity"."last_name" AS "EmployeeEntity_last_name", "EmployeeEntity"."company_id" AS "EmployeeEntity_company_id" FROM "acme"."employee" "EmployeeEntity"

In this article, we will rely on that to show the code generated by TypeORM

Double bind the foreign key

Suppose that your application has already loaded the list of companies and you want to get the list of employees for one specific company, knowing it's companyId

The naive implementation would be this :

findByCompanyId(companyId: string) {
  return this.employeeRepository.findBy({
    company: {
      id: companyId
    }
  })
}

But let's have a look at the generated SQL :

SELECT "EmployeeEntity"."id"            AS "EmployeeEntity_id",
       "EmployeeEntity"."creation_date" AS "EmployeeEntity_creation_date",
       "EmployeeEntity"."first_name"    AS "EmployeeEntity_first_name",
       "EmployeeEntity"."last_name"     AS "EmployeeEntity_last_name",
       "EmployeeEntity"."company_id"    AS "EmployeeEntity_company_id"
FROM "acme"."employee" "EmployeeEntity"
  LEFT JOIN "acme"."company" "EmployeeEntity__EmployeeEntity_company"
    ON "EmployeeEntity__EmployeeEntity_company"."id" = "EmployeeEntity"."company_id"
WHERE (((("EmployeeEntity__EmployeeEntity_company"."id" = $1)))

There is a JOIN between the employee and company tables! This join is not necessary as the "company id" information is already present in the employee table.

And here comes the double binding on the foreign key to solve that :

@Entity({ name: "employee"})
export class EmployeeEntity extends BaseEntity {

  @PrimaryGeneratedColumn()
  id: string;

  // [...]

  @JoinColumn({name: "company_id"})
  @ManyToOne(() => CompanyEntity)
  company: CompanyEntity;

  // Here is where the magic happens
  @Column({name: "company_id"})
  companyId: string;
}

We let the company relationship but add a companyId field, which points to the same column.

Now we can rewrite the findByCompany implementation like this :

findByCompanyId(companyId: string) {
  return this.employeeRepository.findBy({
    // Use the companyId "column" instead of the relationship
    companyId: companyId
  })
}

Now, the generated SQL looks like this :

SELECT "EmployeeEntity"."id"            AS "EmployeeEntity_id",
       "EmployeeEntity"."creation_date" AS "EmployeeEntity_creation_date",
       "EmployeeEntity"."first_name"    AS "EmployeeEntity_first_name",
       "EmployeeEntity"."last_name"     AS "EmployeeEntity_last_name",
       "EmployeeEntity"."company_id"    AS "EmployeeEntity_company_id"
FROM "acme"."employee" "EmployeeEntity"
WHERE (("EmployeeEntity"."company_id" = $1))

No more unnecessary JOIN !

Bonus: "SELECT" only the information you need

In this example, the table employee contains few columns but, in a real-world example, it could contain dozens of columns that you don't need every time. Especially columns that could contain sensible information

Suppose that I just fetch all employees like this:

this.employeeRepository.find();

Here is the generated sql :

SELECT "EmployeeEntity"."id"            AS "EmployeeEntity_id",
       "EmployeeEntity"."creation_date" AS "EmployeeEntity_creation_date",
       "EmployeeEntity"."first_name"    AS "EmployeeEntity_first_name",
       "EmployeeEntity"."last_name"     AS "EmployeeEntity_last_name",
       "EmployeeEntity"."company_id"    AS "EmployeeEntity_company_id"
FROM "acme"."employee" "EmployeeEntity"

But we might not need all the columns. TypeORM lets you decide which columns you want to fetch, like this :

this.employeeRepository.find({
  select: {
    firstName: true,
    lastName: true
  }
});

Note : we don't call findBy anymore but find which allows us to pass other useful parameters such as take, skip and relations (this one is for lazy/eager loading), and of course, the where clause.

And here is the corresponding SQL :

SELECT "EmployeeEntity"."first_name" AS "EmployeeEntity_first_name",
       "EmployeeEntity"."last_name"  AS "EmployeeEntity_last_name",
       "EmployeeEntity"."id"         AS "EmployeeEntity_id"
FROM "acme"."employee" "EmployeeEntity"

Note : the return type will still be a EmployeeEntity[] but the non-required fields are left undefined

Conclusion

Double binding on the foreign key is so useful that I now use it on nearly every foreign key.

Also, select can be useful on main tables, especially the user table, which often contains sensible information that you don't want to expose outside of the database if not necessary.

I hope that those tips will help you reduce the performance overhead on your database!