The n + 1 problem simply means too many database calls. It shows that for every n call made to the database, there exist an n + 1 call because each call being made might need to make further calls to resolve database relationship between entities. This is apparent in Graphql; as you nest queries and mutation deeply, you will run into a performance bootleneck because each nested field needs to be resolved and that means a call to the database. This performane issue explained above can be optimize by database accesses with batching. It’s a strategy used for a GraphQL server and its database, but also for other programming environments.
Lets set up a Graphql server to see how this can be done.
//array of students representing a model in the database
const students = [
{
id: 1,
name: "Jamie Bones",
regNumber: "10/EF/EC/148",
dept: "Chemical Engineering",
courses: ["chm 303", "chm 409", "Bio 345"],
},
{
id: 2,
name: "Jacklie Tommy",
regNumber: "10/EF/EC/149",
dept: "Chemical Engineering",
courses: ["chm 303", "chm 409"],
},
{
id: 3,
name: "Strawbelly Government",
regNumber: "10/EF/EC/145",
dept: "Mechanical Engineering",
courses: ["chm 303", "chm 409", "Bio 345", "Math 567", "GST 212"],
},
{
id: 4,
name: "Uduak Johnson",
regNumber: "10/EF/CD/146",
dept: "Physics",
courses: ["chm 303", "chm 409", "Bio 345", "Phy 212"],
},
{
id: 5,
name: "Justin Fustain",
regNumber: "10/EF/CD/156",
dept: "Electrical Engineering",
courses: ["chm 303", "chm 409", "Bio 345", "Math 567"],
},
];
//courses model
const courses = [
{
id: 5,
courseCode: "Math 567",
courseTitle: "Industrial Mathematics for Nerds",
creditUnit: 3,
students: [3,5]
},
{
id: 4,
courseCode: "chm 303",
courseTitle: "Industrial Chemistry for Nerds",
creditUnit: 3,
students: [1,2,3,4,5]
},
{
id: 3,
courseCode: "chm 409",
courseTitle: "Organo metallic Chemistry",
creditUnit: 3,
students: [1,2,3,4,5]
},
{
id: 2,
courseCode: "Bio 345",
courseTitle: "Theory of Evolution",
creditUnit: 1,
students: [1, 3, 4, 5]
},
{
id: 1,
courseCode: "phy 212",
courseTitle: "Practical Physics",
creditUnit: 2,
students: [4]
},
];
//creating graphql schema and resolvers
const { ApolloServer, gql } = require("apollo-server");
const DataLoader = require("dataloader");
const schema = gql`
type Query {
student(name: String!): Student
allStudents: [Student]
course(courseCode: String!): Course
allCourses: [Course]
listStudentOfferingCourse(courseCode: String!): [Student]
}
type Student {
id: Int
name: String
dept: String
courses: [Course]
}
type Course {
id: Int
courseCode: String
creditUnit: Int
courseTitle: String
students: [Student]
}
`;
//creating our resolvers
const resolvers = {
Query: {
student: (parent, { name }, { students }) => {
return students.find((student) => student.name === name);
},
allStudents: (parent, {}, {}) => {
return students;
},
course: (parent, { courseCode }, { courses }) => {
return courses.find((course) => course.courseCode === courseCode);
},
allCourses: (parent, { courseCode }, { courses }) => {
return courses;
},
listStudentOfferingCourse: (
parent,
{ courseCode },
{ courses, students }
) => {
let offeringCourse = students.filter((student) => {
if (student.courses.includes(courseCode)) return student;
});
return offeringCourse;
},
},
Course: {
students: async (course, args, { students }) =>
{
let _students = students.filter((student) => {
if (student.courses.includes(course.courseCode)) {
return student;
}
});
return _students;
},
},
Student: {
courses: async (student, {}, { courses }) => {
return courses.filter(course => student.courses.includes(course.courseCode))
return await loaders.courses.loadMany(student.courses);
},
},
};
//Apollo server running on the default port.
const server = new ApolloServer({
typeDefs: schema,
resolvers,
context: async ({ req, connection, res }) => {
if (req) {
return {
req,
students,
courses,
};
}
},
});
server.listen().then(({ url }) => {
console.log(`🚀 Server ready at ${url}`);
});
//this is a normal graphql setup that enables us make queries to our server
`
From this minimalist setup we can perform queries as described by our schema. The problem arises when from our query Graphql tries to resolve a field in our schema.
`
//lets construct a simple query to retrieve the list of all students
query {
allStudents {
name
courses {
courseTitle
creditUnit
}
}
}
//running this query will give the following result below
{
"data": {
"allStudents": [
{
"name": "Jamie Bones",
"courses": [
{
"courseTitle": "Industrial Chemistry for Nerds",
"creditUnit": 3
},
{
"courseTitle": "Organo metallic Chemistry",
"creditUnit": 3
},
{
"courseTitle": "Theory of Evolution",
"creditUnit": 1
}
]
},
{
"name": "Jacklie Tommy",
"courses": [
{
"courseTitle": "Industrial Chemistry for Nerds",
"creditUnit": 3
},
{
"courseTitle": "Organo metallic Chemistry",
"creditUnit": 3
}
]
},
{
"name": "Strawbelly Government",
"courses": [
{
"courseTitle": "Industrial Chemistry for Nerds",
"creditUnit": 3
},
{
"courseTitle": "Organo metallic Chemistry",
"creditUnit": 3
},
{
"courseTitle": "Theory of Evolution",
"creditUnit": 1
},
{
"courseTitle": "Industrial Mathematics for Nerds",
"creditUnit": 3
},
null
]
},
{
"name": "Uduak Johnson",
"courses": [
{
"courseTitle": "Industrial Chemistry for Nerds",
"creditUnit": 3
},
{
"courseTitle": "Organo metallic Chemistry",
"creditUnit": 3
},
{
"courseTitle": "Theory of Evolution",
"creditUnit": 1
},
null
]
},
{
"name": "Justin Fustain",
"courses": [
{
"courseTitle": "Industrial Chemistry for Nerds",
"creditUnit": 3
},
{
"courseTitle": "Organo metallic Chemistry",
"creditUnit": 3
},
{
"courseTitle": "Theory of Evolution",
"creditUnit": 1
},
{
"courseTitle": "Industrial Mathematics for Nerds",
"creditUnit": 3
}
]
}
]
}
}
//An additional query is made to our datastore or database to resolve the courses field. This additional query is what is referred to as the n+1 problem.
`
The solution to solving this problem is the Dataloader package which was developed by Facebook and made open
sourced. It ensure that additional request are batched together and sent as a group instead of individually hitting the database.
`
//lets install the Dataloader package from npm
const DataLoader = require("dataloader");
//this is a function to load the data of students in a batch
//the function accepts an array of keys identifiers which can
//be your primary key that is unique to each row or documents in the database and it returns an array of values
const batchStudents = async (keys, studentModel) => {
//studentModel means the datastore
const students = await studentModel.filter((student) =>
keys.includes(student.id)
);
return keys.map((key) => students.find((student) => student.id === key));
};
const batchCourses = async (keys, courseModel) => {
const _courses = await courseModel.filter((course) =>
keys.includes(course.courseCode)
);
return keys.map((key) =>
_courses.find((course) => course.courseCode === key)
);
};
//lets update our Apollo Server Instance by passing our dataloaders to it
const server = new ApolloServer({
typeDefs: schema,
resolvers,
context: async ({ req, connection, res }) => {
if (req) {
return {
req,
students,
courses,
loaders: {
students: new DataLoader((keys) => batchStudents(keys, students)),
courses: new DataLoader((keys) => batchCourses(keys, courses)),
},
};
}
},
});
//we have added the loaders to our server instance
`
The loaders act as abstraction on top of the data models , and can be passed as context to the resolvers. The function gives you access to a list of keys in its arguments. These keys are your set of identifiers, purged of
duplication, which can be used to retrieve items from a database. That’s why keys (identifiers) and
models (data access layer) are passed to the batchStudents() function. The function then takes the keys
to retrieve the entities via the model from the database. By the end of the function, the keys are
mapped in the same order as the retrieved entities. Otherwise, it’s possible to return users right after
their retrieval from the database, though they have a different order than the incoming keys. As a
result, students need to be returned in the same order as their incoming identifiers (keys).
`
//previous content removed for brevity
const resolvers = {
Course: {
students: async (course, args, { loaders }) => { //using the dataloader passed to context
return await loaders.students.loadMany(course.students);
},
},
Student: {
courses: async (student, {}, { loaders }) => {
return await loaders.courses.loadMany(student.courses);
},
},
};
//we can also make use of loader.load which loads a single key
`
Using this loader ensures that unnecessary database trip is not taken but rather keys/identifiers are batched per request to the server and executed and result returned as a group.
Thanks for reading...
Top comments (0)