AnkitDevCode

JPA Mapping with Hibernate: Many-to-Many Relationship

In the previous section, we discussed the One-to-Many and Many-to-One relationships. Now, let’s look at the Many-to-Many relationship.

Table of Contents

  1. Introduction
  2. The Relational Model Behind @ManyToMany
  3. Unidirectional @ManyToMany — Simplest Form
  4. Bidirectional @ManyToMany
  5. equals() and hashCode() — The Critical Foundation
  6. Intermediate Entity for Join Table With Extra Columns
  7. Fetch Strategies & The N+1 Problem
  8. Cascade Types — What to Use and When
  9. Serialization — Avoiding Infinite Recursion
  10. Performance Best Practices
  11. Quick Reference — Best Practices vs Pitfalls
  12. Conclusion

1. Introduction

A many-to-many (M:N) relationship is one of the most common yet most misunderstood associations in relational modeling. When mapped carelessly in JPA/Hibernate, it becomes a prime source of N+1 query problems, infinite JSON recursion, unnecessary eager loading, and subtle data-integrity bugs.

This article walks through every important aspect of @ManyToMany, from the simplest unidirectional form to a full intermediate-entity approach, and pairs every concept with the best practices that keep your application performant and maintainable.


2. The Relational Model Behind @ManyToMany

In a relational database, an M:N relationship is always implemented via a join table (also called a bridge or association table). For example, a Student ↔ Course relationship requires a student_course join table:

students           student_course         courses
──────────────     ─────────────────      ──────────────────
student_id (PK)    student_id  (FK) ─▶    course_id (PK)
name               course_id   (FK) ─▶    title
email              enrolled_at            credits
                   grade

If the join table carries extra columns (enrolled_at, grade), you must model it as a separate entity — a plain @JoinTable annotation cannot capture those columns.


3. Unidirectional @ManyToMany — Simplest Form

Use this when only one side needs to navigate to the other, and the join table has no extra columns.

3.1 Mapping

@Entity
public class Student {
    @Id @GeneratedValue
    private Long id;
    private String name;

    @ManyToMany
    @JoinTable(
        name = "student_course",
        joinColumns = @JoinColumn(name = "student_id"),
        inverseJoinColumns = @JoinColumn(name = "course_id")
    )
    private Set<Course> courses = new HashSet<>();   // ← Set, never List
}

@Entity
public class Course {
    @Id @GeneratedValue
    private Long id;
    private String title;
    // No back-reference here → unidirectional
}

Best Practice — Use Set, not List

Prefer Set<> for @ManyToMany collections. Hibernate treats an unindexed List<> as a bag: removing a single element makes it delete every join row for the owner and re-insert the remaining ones, the collection can hold duplicate links, and JOIN FETCHing two bags in the same query throws MultipleBagFetchException.
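
The duplicate problem can be seen without Hibernate at all. The following is a minimal plain-Java sketch; the class name LinkCollectionDemo and the course code are made up for illustration, and no JPA is involved. A List stores the same link twice, while a Set ignores the second add.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;

// Hypothetical demo class: plain Java collections only, no Hibernate involved.
class LinkCollectionDemo {

    // Adds the same course code twice and reports how many links the collection holds.
    static int linksAfterDoubleAdd(Collection<String> courses) {
        courses.add("CS-101");
        courses.add("CS-101"); // accidental second enrollment
        return courses.size();
    }

    public static void main(String[] args) {
        System.out.println("List links: " + linksAfterDoubleAdd(new ArrayList<>())); // 2
        System.out.println("Set links:  " + linksAfterDoubleAdd(new HashSet<>()));   // 1
    }
}
```

With entities instead of strings, the Set only deduplicates correctly if equals() and hashCode() are implemented sensibly, which is the topic of section 5.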


4. Bidirectional @ManyToMany

Bidirectional mapping lets both sides navigate to each other. Exactly one side must be the owning side (holds @JoinTable); the other is the inverse side (uses mappedBy).

4.1 Mapping

@Entity
public class Student {                            // OWNING SIDE
    @ManyToMany
    @JoinTable(
        name = "student_course",
        joinColumns = @JoinColumn(name = "student_id"),
        inverseJoinColumns = @JoinColumn(name = "course_id")
    )
    private Set<Course> courses = new HashSet<>();
}

@Entity
public class Course {                            // INVERSE SIDE
    @ManyToMany(mappedBy = "courses")            // mappedBy is mandatory
    private Set<Student> students = new HashSet<>();
}

Pitfall — Forgetting mappedBy

Without mappedBy on the inverse side, JPA treats the two fields as two independent unidirectional associations, each with its own join table, so every link ends up being written twice. Declare mappedBy on exactly one side: the inverse side.

4.2 Keeping Both Sides in Sync

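In a bidirectional relationship you must update both sides yourself, ideally in helper methods. Hibernate persists only what the owning side contains and does not update the other in-memory collection for you, so the object graph can disagree with the database until the entities are reloaded: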

// Add a convenience method on the owning side
public void enroll(Course course) {
    this.courses.add(course);
    course.getStudents().add(this);   // keep inverse in sync
}

public void unenroll(Course course) {
    this.courses.remove(course);
    course.getStudents().remove(this);
}
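
To check the helper methods in isolation, here is a small plain-Java sketch. SyncStudent and SyncCourse are hypothetical stand-ins for the entities above, with the JPA annotations left out so the code runs without Hibernate on the classpath.

```java
import java.util.HashSet;
import java.util.Set;

// Stand-in for Student (owning side); annotations omitted for the sketch.
class SyncStudent {
    private final Set<SyncCourse> courses = new HashSet<>();

    Set<SyncCourse> getCourses() { return courses; }

    void enroll(SyncCourse course) {
        courses.add(course);
        course.getStudents().add(this);    // keep inverse side in sync
    }

    void unenroll(SyncCourse course) {
        courses.remove(course);
        course.getStudents().remove(this); // keep inverse side in sync
    }
}

// Stand-in for Course (inverse side).
class SyncCourse {
    private final Set<SyncStudent> students = new HashSet<>();

    Set<SyncStudent> getStudents() { return students; }
}

class SyncDemo {
    public static void main(String[] args) {
        SyncStudent alice = new SyncStudent();
        SyncCourse algebra = new SyncCourse();

        alice.enroll(algebra);
        System.out.println(algebra.getStudents().contains(alice)); // true

        alice.unenroll(algebra);
        System.out.println(algebra.getStudents().isEmpty());       // true
    }
}
```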

5. equals() and hashCode() — The Critical Foundation

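Entity collections rely on equals() and hashCode() whenever instances are placed in a Set or compared across persistence contexts, for example when adding to or removing from a collection or when merging detached entities. The default Object identity implementation only works while every instance comes from the same persistence context; two detached copies of the same row are not equal, so Set operations silently misbehave.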

5.1 Correct Implementation (Business Key or UUID)

@Entity
public class Course {
    @Id @GeneratedValue
    private Long id;

    @NaturalId                     // Hibernate annotation
    @Column(nullable = false, unique = true)
    private String courseCode;     // e.g. "CS-101" — stable business key

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Course)) return false;
        Course other = (Course) o;
        return Objects.equals(courseCode, other.courseCode);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(courseCode);  // must be stable across states
    }
}

Pitfall — Using id for hashCode

Never base hashCode on @Id if entities can be in a Set before being persisted. A transient entity has id = null, so its hashCode changes on persist, which silently corrupts any Set or HashMap that contained it.
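
The corruption is easy to reproduce in plain Java. In this sketch, IdHashedCourse is a hypothetical entity-like class (no JPA involved), and assigning the id by hand simulates what happens when the database generates it on persist.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity-like class whose hashCode depends on a generated id.
class IdHashedCourse {
    Long id; // null until "persisted"

    @Override
    public boolean equals(Object o) {
        return o instanceof IdHashedCourse && Objects.equals(id, ((IdHashedCourse) o).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // changes when id goes from null to a value
    }
}

class HashCodePitfallDemo {
    // Returns whether the set still finds the course after its id is assigned.
    static boolean foundAfterIdAssigned() {
        Set<IdHashedCourse> courses = new HashSet<>();
        IdHashedCourse course = new IdHashedCourse();
        courses.add(course);  // stored under the hash of a null id
        course.id = 42L;      // simulates the id being generated on persist
        return courses.contains(course);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterIdAssigned()); // false
    }
}
```

The object is still physically inside the set, but lookups now search under the new hash and miss it.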


6. Intermediate Entity for Join Table With Extra Columns

When the join table needs to store data (enrollment date, grade, seat number, etc.), replace the @ManyToMany shortcut with an explicit intermediate entity. This is the most robust and recommended pattern in production systems.

6.1 Composite Key Class

@Embeddable
public class EnrollmentId implements Serializable {
    @Column(name = "student_id")
    private Long studentId;

    @Column(name = "course_id")
    private Long courseId;

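    // equals() + hashCode() are required for @Embeddable PKs (needs java.util.Objects)
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EnrollmentId)) return false;
        EnrollmentId other = (EnrollmentId) o;
        return Objects.equals(studentId, other.studentId)
            && Objects.equals(courseId, other.courseId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(studentId, courseId);
    }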
}
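
Why the composite key needs value-based equality can be shown with a plain-Java sketch. EnrollmentKey, EnrollmentRow, and EnrollmentRegistry are hypothetical names, and a record is used only because it generates equals() and hashCode() from its components; this is not a JPA mapping.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// A record gets value-based equals()/hashCode(), which is the behavior
// the @Embeddable key class has to provide by hand.
record EnrollmentKey(Long studentId, Long courseId) {}

// The link object carries the extra column of the join table.
record EnrollmentRow(EnrollmentKey key, LocalDate enrolledAt) {}

class EnrollmentRegistry {
    private final Map<EnrollmentKey, EnrollmentRow> rows = new HashMap<>();

    // Returns false when the (student, course) pair is already enrolled.
    boolean enroll(Long studentId, Long courseId, LocalDate date) {
        EnrollmentKey key = new EnrollmentKey(studentId, courseId);
        return rows.putIfAbsent(key, new EnrollmentRow(key, date)) == null;
    }

    int size() { return rows.size(); }

    public static void main(String[] args) {
        EnrollmentRegistry registry = new EnrollmentRegistry();
        System.out.println(registry.enroll(1L, 10L, LocalDate.of(2024, 9, 1))); // true
        System.out.println(registry.enroll(1L, 10L, LocalDate.of(2024, 9, 2))); // false
    }
}
```

If the key used identity equality instead, the second call would create a second row for the same student and course.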

6.2 Enrollment (Intermediate) Entity

@Entity
@Table(name = "student_course")
public class Enrollment {

    @EmbeddedId
    private EnrollmentId id = new EnrollmentId();

    @ManyToOne(fetch = FetchType.LAZY)
    @MapsId("studentId")
    private Student student;

    @ManyToOne(fetch = FetchType.LAZY)
    @MapsId("courseId")
    private Course course;

    @Column(nullable = false)
    private LocalDate enrolledAt;

    private BigDecimal grade;
}

6.3 Parent Entities

@Entity
public class Student {
    @OneToMany(mappedBy = "student", cascade = CascadeType.ALL, orphanRemoval = true)
    private Set<Enrollment> enrollments = new HashSet<>();

    public void enroll(Course course, LocalDate date) {
        Enrollment e = new Enrollment();
        e.setStudent(this);
        e.setCourse(course);
        e.setEnrolledAt(date);
        enrollments.add(e);
    }
}

@Entity
public class Course {
    @OneToMany(mappedBy = "course")   // no cascade from Course side
    private Set<Enrollment> enrollments = new HashSet<>();
}

Best Practice — Cascade only from the aggregate root

Apply CascadeType.ALL + orphanRemoval only on the owning aggregate root side (Student). Do not cascade from Course — it is a separate aggregate and should not delete enrollments when a course is touched.


7. Fetch Strategies & The N+1 Problem

Fetch strategy is one of the most impactful performance decisions in a JPA application.

7.1 Always Use LAZY — Never EAGER

// ✅ Correct: LAZY is the safe choice (and already the JPA default for @ManyToMany)
@ManyToMany(fetch = FetchType.LAZY)
private Set<Course> courses = new HashSet<>();

// ❌ Wrong: every Student that gets loaded also loads its full course set, needed or not
@ManyToMany(fetch = FetchType.EAGER)
private Set<Course> courses;

7.2 Solving N+1 With JOIN FETCH

LAZY loading alone does not prevent N+1: touching the lazy collection of each entity inside a loop triggers one additional SQL query per entity. Fix this with a JOIN FETCH JPQL query:

// N+1 — fires one extra query per student
List<Student> students = em.createQuery("SELECT s FROM Student s", Student.class)
                           .getResultList();
students.forEach(s -> s.getCourses().size()); // N additional queries

// ✅ Fixed: a single JOIN query (use LEFT JOIN FETCH to keep students without courses)
List<Student> studentsWithCourses = em.createQuery(
    "SELECT DISTINCT s FROM Student s JOIN FETCH s.courses",
    Student.class).getResultList();

7.3 Using @BatchSize as a Middle Ground

@ManyToMany(fetch = FetchType.LAZY)
@BatchSize(size = 25)   // loads 25 students' courses in one IN (...) query
private Set<Course> courses = new HashSet<>();
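
To put rough numbers on the three strategies, here is an idealized model of the query counts. QueryCountDemo and its methods are hypothetical; the sketch does plain arithmetic, not actual Hibernate calls, and real batch fetching may group ids differently depending on the Hibernate version and configuration.

```java
// Idealized model, not Hibernate itself: counts the SELECTs needed to load
// the courses of a given number of students under each strategy.
class QueryCountDemo {

    // One query for the students, then one lazy-load query per student.
    static int lazyInLoop(int students) {
        return 1 + students;
    }

    // One query for the students, then one IN (...) query per batch of ids.
    static int withBatchSize(int students, int batchSize) {
        int batches = (students + batchSize - 1) / batchSize; // ceiling division
        return 1 + batches;
    }

    // A single query that joins students and courses.
    static int withJoinFetch() {
        return 1;
    }

    public static void main(String[] args) {
        System.out.println(lazyInLoop(100));        // 101
        System.out.println(withBatchSize(100, 25)); // 5
        System.out.println(withJoinFetch());        // 1
    }
}
```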

Pitfall — MultipleBagFetchException

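You cannot JOIN FETCH two List<> collections in the same JPQL query; Hibernate throws MultipleBagFetchException. Changing both to Set<> makes the exception go away but still produces a Cartesian product of the two collections, so the safer fix is to fetch one collection in JPQL and load the second with @BatchSize or a separate query.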


8. Cascade Types — What to Use and When

Cascade types control which JPA lifecycle operations (PERSIST, MERGE, REMOVE, etc.) are propagated from parent to child.

8.1 Recommended Cascade Matrix

Scenario                             Cascade          orphanRemoval   Notes
──────────────────────────────────   ──────────────   ─────────────   ────────────────────────
Simple @ManyToMany (no extra cols)   PERSIST, MERGE   false           Do NOT use REMOVE
Intermediate entity (owned)          ALL              true            Only from aggregate root
Intermediate entity (shared)         PERSIST, MERGE   false           Shared = don't remove
Course → Enrollment (inverse)        (none)           false           Let Student own it

Pitfall — CascadeType.REMOVE on @ManyToMany

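Using CascadeType.REMOVE (or ALL) on a plain @ManyToMany makes Hibernate try to delete the related entities themselves, not just the join rows. Removing one Student then cascades to all of that student's Course records in the courses table, which either affects every other enrolled student or fails on a foreign-key constraint when other students still reference those courses.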


9. Serialization — Avoiding Infinite Recursion

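Bidirectional relationships create circular object graphs. When Jackson (or any JSON library) tries to serialize a Student that contains Courses, which contain Students, which contain Courses... it recurses until the stack overflows. Jackson typically reports this as a JsonMappingException caused by infinite recursion (StackOverflowError).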
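
The recursion itself does not depend on Jackson. The following plain-Java sketch uses a deliberately naive hand-written serializer; CycleStudent, CycleCourse, and toJson() are hypothetical names, not part of any library.

```java
import java.util.HashSet;
import java.util.Set;

// Plain-Java stand-ins for the bidirectional entities.
class CycleStudent {
    final String name;
    final Set<CycleCourse> courses = new HashSet<>();

    CycleStudent(String name) { this.name = name; }

    // Naive serializer: follows every reference, including back-references.
    String toJson() {
        StringBuilder sb = new StringBuilder("{\"name\":\"" + name + "\",\"courses\":[");
        for (CycleCourse c : courses) sb.append(c.toJson());
        return sb.append("]}").toString();
    }
}

class CycleCourse {
    final String title;
    final Set<CycleStudent> students = new HashSet<>();

    CycleCourse(String title) { this.title = title; }

    String toJson() {
        StringBuilder sb = new StringBuilder("{\"title\":\"" + title + "\",\"students\":[");
        for (CycleStudent s : students) sb.append(s.toJson()); // back to the student
        return sb.append("]}").toString();
    }
}

class CycleDemo {
    // Returns true if serializing the cyclic graph exhausts the stack.
    static boolean overflows() {
        CycleStudent alice = new CycleStudent("Alice");
        CycleCourse algebra = new CycleCourse("Algebra");
        alice.courses.add(algebra);
        algebra.students.add(alice);
        try {
            alice.toJson();
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(overflows());
    }
}
```

Any fix has to cut the cycle somewhere: either one side is excluded from serialization, or the graph is copied into a structure that has no back-references.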

9.1 Jackson Annotations

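// On the owning side (Student): courses are serialized
private Set<Course> courses;

// On the inverse side (Course): @JsonIgnore breaks the cycle.
// @JsonManagedReference/@JsonBackReference do not fit here, because
// @JsonBackReference cannot be placed on a Collection-typed property.
@JsonIgnore
private Set<Student> students;  // this side is NOT serialized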

9.2 Better: Use DTOs (Recommended)

Never serialize JPA entities directly to your API layer. Use dedicated DTO/response classes:

// DTO — safe, no cycles, no Hibernate proxies
public record CourseDTO(Long id, String title, int credits) {
    public static CourseDTO from(Course c) {
        return new CourseDTO(c.getId(), c.getTitle(), c.getCredits());
    }
}

public record StudentDTO(Long id, String name, Set<CourseDTO> courses) {
    public static StudentDTO from(Student s) {
        return new StudentDTO(
            s.getId(), s.getName(),
            s.getCourses().stream().map(CourseDTO::from).collect(Collectors.toSet())
        );
    }
}

10. Performance Best Practices

10.1 Projections and DTO Queries

For read-heavy endpoints, skip entity loading entirely and query directly into DTOs:

@Query("SELECT new com.example.dto.StudentCourseDTO(s.name, c.title) " +
       "FROM Student s JOIN s.courses c WHERE s.id = :studentId")
List<StudentCourseDTO> findCoursesByStudent(@Param("studentId") Long id);

10.2 Pagination — Never Paginate With JOIN FETCH

// ❌ Wrong — Hibernate loads ALL rows into memory, then paginates
@Query("SELECT DISTINCT s FROM Student s JOIN FETCH s.courses")
Page<Student> findAllWithCourses(Pageable pageable);   // logs HHH90003004 (HHH000104 on Hibernate 5)

// ✅ Correct — paginate the root entity, load collection separately
@Query(value = "SELECT s FROM Student s",
       countQuery = "SELECT COUNT(s) FROM Student s")
Page<Student> findAll(Pageable pageable);
// Then use @BatchSize or a second query to load courses
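
The reason Hibernate falls back to in-memory pagination is that a SQL LIMIT counts joined rows, not students. Here is a plain-Java sketch of that mismatch; JoinedRow and the sample data are made up for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// One row of a students-to-courses join result.
record JoinedRow(long studentId, String courseTitle) {}

class JoinPaginationDemo {

    // A join returns one row per (student, course) pair, not one row per student.
    static List<JoinedRow> joinedRows() {
        return List.of(
            new JoinedRow(1, "Algebra"),
            new JoinedRow(1, "Biology"),
            new JoinedRow(1, "Chemistry"),
            new JoinedRow(2, "Algebra"),
            new JoinedRow(3, "Biology"));
    }

    // Applies a SQL-style LIMIT to the joined rows and counts distinct students.
    static int studentsInFirstRows(int limit) {
        Set<Long> ids = new HashSet<>();
        joinedRows().stream().limit(limit).forEach(r -> ids.add(r.studentId()));
        return ids.size();
    }

    public static void main(String[] args) {
        // A page size of 2 should yield 2 students, but LIMIT 2 on the join yields 1.
        System.out.println(studentsInFirstRows(2)); // 1
    }
}
```

Because a row limit would cut a student's collection in half, Hibernate fetches the whole join and slices the page in memory, which is exactly what the warning is about.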

10.3 Use @Transactional on Service, Not Repository

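Keep your transactions at the service layer, where the full unit of work is clear. If the transaction begins and ends inside a repository method, the persistence context is usually closed by the time the service touches a lazy collection, which typically ends in a LazyInitializationException.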


11. Quick Reference — Best Practices vs Pitfalls

✅ Best Practice                                                    ❌ Pitfall to Avoid
────────────────────────────────────────────────────────────────   ───────────────────────────────────────────────────────────
Use an intermediate entity when the join table has extra columns   Using plain @JoinTable when join table has extra data
Set fetch = FetchType.LAZY on both sides                           Using FetchType.EAGER (causes N+1 queries)
Define owning side clearly with mappedBy on inverse                Bidirectional mapping without mappedBy
Use Set<> instead of List<> to avoid duplicates                    Using List<> and getting MultipleBagFetchException
Use orphanRemoval + CascadeType.ALL on parent side only            Cascading ALL on both sides (unintended deletes of shared entities)
Implement equals()/hashCode() based on business key                Using default Object identity for equals/hashCode
Use @BatchSize or JOIN FETCH to load related data                  Loading collections in a loop (classic N+1 problem)
Use DTOs and projections for read-heavy queries                    Serializing full entity graphs to JSON (StackOverflow risk)

12. Conclusion

Many-to-many associations are powerful but require deliberate design. The key takeaways are:

  • Use Set<> — always. List<> in M:N leads to bags, duplicates, and MultipleBagFetchException.
  • Prefer intermediate entity — as soon as the join table has any extra column, model it explicitly.
  • Keep fetch=LAZY everywhere — solve loading problems with JOIN FETCH or @BatchSize, not EAGER.
  • Define equals()/hashCode() on a stable business key — never rely on the database-generated id.
  • Cascade carefully: REMOVE and orphanRemoval belong only on aggregate-root-owned children.
  • Use DTOs at the API layer — never serialize entity graphs directly to JSON.
  • Measure first — use Hibernate's statistics or a query logger (P6Spy/datasource-proxy) to confirm you have no N+1 queries before shipping.
