Issue
I noticed something interesting.
A `@OneToMany` relationship in JPA is prone to the N+1 query problem. To avoid that performance issue, we need to use `fetch join` in JPQL or the `@EntityGraph` annotation.
But then we run into another problem: the result contains duplicated parent entities, because the join multiplies each parent row by its children (the "Cartesian product" effect).
In SQL, `fetch join` becomes an `inner join` and `@EntityGraph` becomes a `left outer join`.
So we have to use `distinct` in JPQL or a `Set` data structure in Java.
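The duplication and the `Set` workaround described above can be illustrated without a database. The following is a minimal sketch in plain Java, assuming (as Hibernate does within one persistence context) that every joined row refers to the same parent instance. `DuplicateParentSketch`, its nested `Post` class, and both helper methods are hypothetical names used only for illustration; they are not part of the test project.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class DuplicateParentSketch {

    // Hypothetical stand-in for the Post entity; not the real JPA class.
    static final class Post {
        final long id;
        final String title;

        Post(long id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Mimics the shape of a fetch join result without DISTINCT:
    // one list element per joined child row, all pointing at the same parent.
    static List<Post> simulateFetchJoinResult(Post parent, int replyCount) {
        List<Post> rows = new ArrayList<>();
        for (int i = 0; i < replyCount; i++) {
            rows.add(parent);
        }
        return rows;
    }

    // Removes repeated parents while keeping the first-seen order.
    static List<Post> deduplicate(List<Post> rows) {
        Set<Post> unique = new LinkedHashSet<>(rows);
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        Post first = new Post(1L, "first post");
        List<Post> rows = simulateFetchJoinResult(first, 10);
        System.out.println(rows.size());              // prints 10
        System.out.println(deduplicate(rows).size()); // prints 1
    }
}
```

Because `Post` here does not override `equals` or `hashCode`, the `LinkedHashSet` falls back to object identity, which is enough when all duplicates are the same instance.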
Here is my question.
When using `fetch join`, the result contains duplicated entities.
However, when using the `@EntityGraph` annotation, the duplicates do not appear.
Let me show you an example. Here is my data set.
post.id | post.content | post.title |
---|---|---|
1 | this is the first post. | first post |

reply.id | reply.content | reply.post_id |
---|---|---|
1 | first-reply-1 | 1 |
2 | first-reply-2 | 1 |
3 | first-reply-3 | 1 |
4 | first-reply-4 | 1 |
5 | first-reply-5 | 1 |
6 | first-reply-6 | 1 |
7 | first-reply-7 | 1 |
8 | first-reply-8 | 1 |
9 | first-reply-9 | 1 |
10 | first-reply-10 | 1 |
Suppose we run the following query.
select *
from test.post inner join test.reply on test.post.id = test.reply.post_id;
We expect a result like the table below, with the post repeated once per reply.
But the `@EntityGraph` annotation doesn't behave like this.
post.id | post.content | post.title | reply.id | reply.content | reply.post_id |
---|---|---|---|---|---|
1 | this is the first post. | first post | 1 | first-reply-1 | 1 |
1 | this is the first post. | first post | 2 | first-reply-2 | 1 |
1 | this is the first post. | first post | 3 | first-reply-3 | 1 |
1 | this is the first post. | first post | 4 | first-reply-4 | 1 |
1 | this is the first post. | first post | 5 | first-reply-5 | 1 |
1 | this is the first post. | first post | 6 | first-reply-6 | 1 |
1 | this is the first post. | first post | 7 | first-reply-7 | 1 |
1 | this is the first post. | first post | 8 | first-reply-8 | 1 |
1 | this is the first post. | first post | 9 | first-reply-9 | 1 |
1 | this is the first post. | first post | 10 | first-reply-10 | 1 |
Test Code
Post Entity
package blog.in.action.post;

import blog.in.action.reply.Reply;
import lombok.*;

import javax.persistence.*;
import java.util.ArrayList;
import java.util.List;

@Builder
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
@Entity
public class Post {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private long id;

    @Column
    private String title;

    @Column
    private String content;

    // Inverse side of the association; Reply.post owns the foreign key.
    @OneToMany(mappedBy = "post")
    private List<Reply> replies;

    // The builder leaves replies null, so the list is created on first use.
    public void addReply(Reply reply) {
        if (replies == null) {
            replies = new ArrayList<>();
        }
        replies.add(reply);
    }
}
Reply Entity
package blog.in.action.reply;

import blog.in.action.post.Post;
import lombok.*;

import javax.persistence.*;

@Builder
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
@Entity
public class Reply {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private long id;

    @Column
    private String content;

    // Owning side of the association; maps the reply.post_id column.
    @ManyToOne
    @JoinColumn(name = "post_id")
    private Post post;
}
PostRepository repository
package blog.in.action.post;

import org.springframework.data.jpa.repository.EntityGraph;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;

import java.util.List;

public interface PostRepository extends JpaRepository<Post, Long> {

    // Fetch join without DISTINCT: translated to an inner join.
    @Query(value = "SELECT p FROM Post p JOIN FETCH p.replies WHERE p.title = :title")
    List<Post> findByTitleFetchJoinWithoutDistinct(String title);

    // Entity graph without DISTINCT: translated to a left outer join.
    @EntityGraph(attributePaths = {"replies"})
    @Query(value = "SELECT p FROM Post p WHERE p.title = :title")
    List<Post> findByTitleEntityGraphWithoutDistinct(String title);
}
PostRepositoryTest tests
package blog.in.action.post;

import blog.in.action.reply.Reply;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.orm.jpa.DataJpaTest;

import javax.persistence.EntityManager;
import java.util.List;

import static org.assertj.core.api.AssertionsForClassTypes.assertThat;

@DataJpaTest
public class PostRepositoryTest {

    @Autowired
    private EntityManager em;

    @Autowired
    private PostRepository postRepository;

    Post getPost(String title, String content) {
        return Post.builder()
                .title(title)
                .content(content)
                .build();
    }

    // Persists ten replies for the given post.
    void insertReply(Post post, String content) {
        for (int index = 0; index < 10; index++) {
            Reply reply = Reply.builder()
                    .content(content + index)
                    .post(post)
                    .build();
            post.addReply(reply);
            em.persist(reply);
        }
    }

    @BeforeEach
    public void setup() {
        Post post = getPost("first post", "this is the first post.");
        Post secondPost = getPost("second post", "this is the second post.");
        postRepository.save(post);
        postRepository.save(secondPost);
        insertReply(post, "first-reply-");
        insertReply(secondPost, "second-reply-");
        // Flush and clear so the queries below hit the database instead of the first-level cache.
        em.flush();
        em.clear();
    }

    @Test
    public void whenFindByTitleFetchJoinWithoutDistinct_thenJustOneQuery() {
        List<Post> posts = postRepository.findByTitleFetchJoinWithoutDistinct("first post");
        // One element per joined reply row.
        assertThat(posts.size()).isEqualTo(10);
    }

    @Test
    public void whenFindByTitleEntityGraphWithoutDistinct_thenJustOneQuery() {
        List<Post> posts = postRepository.findByTitleEntityGraphWithoutDistinct("first post");
        // Only one element, although the SQL also joins ten reply rows.
        assertThat(posts.size()).isEqualTo(1);
    }
}
whenFindByTitleFetchJoinWithoutDistinct_thenJustOneQuery test
- log
select post0_.id as id1_0_0_,
replies1_.id as id1_1_1_,
post0_.content as content2_0_0_,
post0_.title as title3_0_0_,
replies1_.content as content2_1_1_,
replies1_.post_id as post_id3_1_1_,
replies1_.post_id as post_id3_1_0__,
replies1_.id as id1_1_0__
from post post0_
inner join reply replies1_ on post0_.id = replies1_.post_id
where post0_.title = ?
whenFindByTitleEntityGraphWithoutDistinct_thenJustOneQuery test
- log
select post0_.id as id1_0_0_,
replies1_.id as id1_1_1_,
post0_.content as content2_0_0_,
post0_.title as title3_0_0_,
replies1_.content as content2_1_1_,
replies1_.post_id as post_id3_1_1_,
replies1_.post_id as post_id3_1_0__,
replies1_.id as id1_1_0__
from post post0_
left outer join reply replies1_ on post0_.id = replies1_.post_id
where post0_.title = ?
Does anyone know why this happens?
Full test code link
Solution
For entity graphs, distinct is applied by default. The framework filters out the duplicates on the Java side using an `IdentitySet`.
This was implemented in version 5.2.10 (HHH-11569).
See the source code:
org/hibernate/hql/internal/ast/QueryTranslatorImpl.java
public List list(SharedSessionContractImplementor session, QueryParameters queryParameters)
        throws HibernateException {
    ...
    final boolean needsDistincting = (
            query.getSelectClause().isDistinct() ||
            getEntityGraphQueryHint() != null || //In case query has Entity Graph HINT applies distincting of result records
            hasLimit )
            && containsCollectionFetches();
    ...
    if ( needsDistincting ) {
        int includedCount = -1;
        // NOTE : firstRow is zero-based
        int first = !hasLimit || queryParameters.getRowSelection().getFirstRow() == null
                ? 0
                : queryParameters.getRowSelection().getFirstRow();
        int max = !hasLimit || queryParameters.getRowSelection().getMaxRows() == null
                ? -1
                : queryParameters.getRowSelection().getMaxRows();
        List tmp = new ArrayList();
        IdentitySet distinction = new IdentitySet();
        for ( final Object result : results ) {
            if ( !distinction.add( result ) ) {
                continue;
            }
            includedCount++;
            if ( includedCount < first ) {
                continue;
            }
            tmp.add( result );
            // NOTE : ( max - 1 ) because first is zero-based while max is not...
            if ( max >= 0 && ( includedCount - first ) >= ( max - 1 ) ) {
                break;
            }
        }
        results = tmp;
    }
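The filtering loop above can be approximated in plain Java to see how it behaves. This is a simplified sketch, not Hibernate's code: it assumes a set backed by `java.util.IdentityHashMap` as a stand-in for Hibernate's `IdentitySet`, and the names `IdentityDistinctSketch` and `distinctByIdentity` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class IdentityDistinctSketch {

    // first: zero-based index of the first distinct result to keep.
    // max:   maximum number of results to keep, or -1 for no limit.
    static <T> List<T> distinctByIdentity(List<T> results, int first, int max) {
        // Identity-based set: two elements are "the same" only if they are the same object (==).
        Set<T> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        List<T> kept = new ArrayList<>();
        int includedCount = -1;
        for (T result : results) {
            if (!seen.add(result)) {
                continue; // this instance was already returned by an earlier row
            }
            includedCount++;
            if (includedCount < first) {
                continue; // still before the requested offset
            }
            kept.add(result);
            if (max >= 0 && (includedCount - first) >= (max - 1)) {
                break; // reached the requested limit
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Object firstPost = new Object();
        Object secondPost = new Object();
        // Three joined rows for the first post, two for the second.
        List<Object> rows = Arrays.asList(firstPost, firstPost, firstPost, secondPost, secondPost);
        System.out.println(distinctByIdentity(rows, 0, -1).size()); // prints 2
        System.out.println(distinctByIdentity(rows, 0, 1).size());  // prints 1
    }
}
```

Comparing by identity rather than `equals` works here because, within one persistence context, every row for the same primary key resolves to the same entity instance.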
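For HQL or JPQL queries, you can get the same effect by using the `DISTINCT` keyword together with the `PASS_DISTINCT_THROUGH` query hint set to `false`, so that duplicates are removed on the Java side without `DISTINCT` being passed to the SQL statement. Details are described in The best way to use the JPQL DISTINCT keyword with JPA and Hibernate, as Simon mentioned.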
Answered By - Eugene
Answer Checked By - Dawn Plyler (JavaFixing Volunteer)