Innovation Week 5, Part 2 - Comment Sorting, Fetch Join, @Query

구너드 2023. 7. 8. 02:00

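The requirements said that Posts and Comments must be sorted by creation date. The more recently a Post or Comment was written, the higher it should appear, but right now they are sorted the opposite way. For Post this was easy to do with a query method on createdAt, but Comment took more thought. I spent a while thinking about where, given the association between the two, the ordering of Comments could be expressed most simply.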

First, it seemed best to handle "fetching a Post also returns all of its Comments" directly in PostResponseDto. Post holds its Comments in a List, so either a loop or a stream is needed. I set up the basic structure so that each Comment is converted to a CommentResponseDto.

    public PostResponseDto(Post post) {
        this.id = post.getId();
        this.title = post.getTitle();
        this.username = post.getUsername();
        this.description = post.getDescription();
        this.createdAt = post.getCreatedAt();
        this.modifiedAt = post.getModifiedAt();
        this.commentList = post.getCommentList().stream()
                .map(CommentResponseDto::new)
                .collect(Collectors.toList());
    }

However, returning this as is leaves the Comments ordered oldest first, so they had to be put in reverse order.

    public PostResponseDto(Post post) {
        this.id = post.getId();
        this.title = post.getTitle();
        this.username = post.getUsername();
        this.description = post.getDescription();
        this.createdAt = post.getCreatedAt();
        this.modifiedAt = post.getModifiedAt();
        this.commentList = post.getCommentList().stream()
                .map(CommentResponseDto::new)
                .sorted(Collections.reverseOrder())
                .collect(Collectors.toList());
    }

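I expected the change above to reverse the order of the comments as well. But when I ran the code, it threw a ClassCastException. Looking into the cause: Collections.reverseOrder() compares elements by their natural ordering, which means casting them to Comparable, and the CommentResponseDto I had created defines no ordering. Implementing Comparable on it makes this sorted call work.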
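The failure can be reproduced without Spring or JPA. The following is a minimal sketch, not the project's code: `ReverseOrderDemo`, `PlainDto`, and `failsToSort` are hypothetical names, and `PlainDto` stands in for any DTO that does not implement Comparable.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class ReverseOrderDemo {

    // Hypothetical stand-in for a DTO that does NOT implement Comparable.
    static class PlainDto {
        final LocalDateTime createdAt;

        PlainDto(LocalDateTime createdAt) {
            this.createdAt = createdAt;
        }
    }

    // Returns true if sorting with Collections.reverseOrder() fails at runtime.
    static boolean failsToSort(List<PlainDto> dtos) {
        try {
            dtos.stream()
                    .sorted(Collections.reverseOrder()) // compiles, but casts elements to Comparable
                    .collect(Collectors.toList());
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        LocalDateTime t = LocalDateTime.of(2023, 7, 8, 2, 0);
        List<PlainDto> one = Collections.singletonList(new PlainDto(t));
        List<PlainDto> two = Arrays.asList(new PlainDto(t), new PlainDto(t.plusMinutes(1)));

        System.out.println(failsToSort(one)); // false: a single element is never compared
        System.out.println(failsToSort(two)); // true: the first comparison attempts the cast
    }
}
```

The code compiles because `Collections.reverseOrder()` places no bound on its type parameter, so the mistake only surfaces once two elements are actually compared. A Post with zero or one Comment would therefore not have triggered the exception.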

import java.time.LocalDateTime;

import lombok.Getter;

@Getter
public class CommentResponseDto implements Comparable<CommentResponseDto> {

    private Long id;
    private String content;
    private String username;
    private LocalDateTime createdAt;
    private LocalDateTime modifiedAt;

    public CommentResponseDto(Comment comment) {
        this.id = comment.getId();
        this.content = comment.getContent();
        this.username = comment.getUsername();
        this.createdAt = comment.getCreatedAt();
        this.modifiedAt = comment.getModifiedAt();
    }

    // Natural ordering is oldest first; Collections.reverseOrder() flips it to newest first.
    @Override
    public int compareTo(CommentResponseDto other) {
        return this.createdAt.compareTo(other.getCreatedAt());
    }
}

With Comparable implemented, CommentResponseDto now works with sorted, and the comments came out in the order I wanted.
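Implementing Comparable is not the only way to sort in the application. As an alternative sketch (not what this project used), a Comparator can be passed to sorted directly, which leaves the DTO without a natural ordering. `ComparatorSortDemo`, `CommentDto`, and `idsNewestFirst` are hypothetical names for illustration.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class ComparatorSortDemo {

    // Hypothetical, trimmed-down DTO; it does not implement Comparable.
    static class CommentDto {
        private final Long id;
        private final LocalDateTime createdAt;

        CommentDto(Long id, LocalDateTime createdAt) {
            this.id = id;
            this.createdAt = createdAt;
        }

        Long getId() {
            return id;
        }

        LocalDateTime getCreatedAt() {
            return createdAt;
        }
    }

    // Sorts newest first using an explicit Comparator and returns the ids in that order.
    static List<Long> idsNewestFirst(List<CommentDto> comments) {
        return comments.stream()
                .sorted(Comparator.comparing(CommentDto::getCreatedAt).reversed())
                .map(CommentDto::getId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        LocalDateTime t = LocalDateTime.of(2023, 7, 8, 2, 0);
        List<CommentDto> comments = Arrays.asList(
                new CommentDto(1L, t),
                new CommentDto(2L, t.plusMinutes(1)),
                new CommentDto(3L, t.plusMinutes(2)));

        System.out.println(idsNewestFirst(comments)); // [3, 2, 1]
    }
}
```

The trade-off is mostly one of placement: Comparable bakes a single ordering into the DTO, while a Comparator keeps the ordering at the call site where it is needed.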

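That said, while implementing Comparable on CommentResponseDto works, it means fetching the data from the DB and then re-sorting it at the application level. Since commentList is a List (an ArrayList here) and the sort would run for every Post on every request, I expected reordering its elements to cost more than it looks. So I concluded that, rather than re-sorting the list in the application, fetching data that is already in reverse order from the DB would probably perform better. With that in mind I looked for a way to load the comments newest first directly from the DB, and decided to also address the N+1 problem that can come up along the way.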

I removed sorted from PostResponseDto and searched for how to apply a fetch join in PostRepository. The @Query annotation looked like it would solve the problem.

    @Query("select distinct p from Post p join fetch p.commentList cl order by cl.createdAt desc")
    List<Post> findAllByOrderByCreatedAtAtDesc();

The first version I wrote

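But when I ran the tests, Posts without any comment were not returned, and the order of the Posts changed depending on when their comments had been written. The Posts went missing because a plain join fetch is applied as an inner join, not an outer join. The Posts were reordered because the order by clause sorted only on the creation date of commentList, descending. Having identified both problems, I revised the JPQL.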

    @Query("select distinct p from Post p left join fetch p.commentList cl order by p.createdAt desc, cl.createdAt desc")
    List<Post> findAllByOrderByCreatedAtAtDesc();

The revised version

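With left join fetch, Posts without comments are returned too, and the multi-key ordering brings the data back in the desired order. The problem turned out to be simpler than I expected, and when I checked the query that join fetch actually sends to the DB, the N+1 problem did not occur either. This part was resolved cleanly, but it left me wondering which is the more reasonable choice, join fetch or batch size.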
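To see why the first query misbehaved and the revised one did not, the row-level behavior can be modeled in plain Java. This is only a sketch under stated assumptions: it does not run JPQL, the class `JoinOrderingModel` and all data in it are hypothetical, creation times are reduced to integer ticks, and duplicate Posts are assumed to be collapsed in first-seen order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class JoinOrderingModel {

    // One joined row: post id, post creation tick, comment creation tick (null = no comment).
    static class Row {
        final long postId;
        final int postCreated;
        final Integer commentCreated;

        Row(long postId, int postCreated, Integer commentCreated) {
            this.postId = postId;
            this.postCreated = postCreated;
            this.commentCreated = commentCreated;
        }
    }

    // Hypothetical data: post 1 (tick 1) has comments at ticks 5 and 9, post 2 (tick 2) has one at tick 7.
    // An inner join produces no row for post 3, which has no comments.
    static List<Row> innerJoinRows() {
        return Arrays.asList(new Row(1, 1, 5), new Row(1, 1, 9), new Row(2, 2, 7));
    }

    // A left join additionally keeps post 3 (tick 3) with a null comment.
    static List<Row> leftJoinRows() {
        List<Row> rows = new ArrayList<>(innerJoinRows());
        rows.add(new Row(3, 3, null));
        return rows;
    }

    // Descending by comment tick; nulls go last (a choice made for this model only).
    static int commentDesc(Integer a, Integer b) {
        if (a == null || b == null) {
            return (a == null ? 1 : 0) - (b == null ? 1 : 0);
        }
        return Integer.compare(b, a);
    }

    // Models "order by cl.createdAt desc".
    static final Comparator<Row> FIRST_QUERY_ORDER =
            (a, b) -> commentDesc(a.commentCreated, b.commentCreated);

    // Models "order by p.createdAt desc, cl.createdAt desc".
    static final Comparator<Row> REVISED_QUERY_ORDER = (a, b) -> {
        int byPost = Integer.compare(b.postCreated, a.postCreated);
        return byPost != 0 ? byPost : commentDesc(a.commentCreated, b.commentCreated);
    };

    // Sorts the rows, then collapses them to distinct post ids in first-seen order.
    static List<Long> distinctPostIds(List<Row> rows, Comparator<Row> order) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(order);
        Set<Long> ids = new LinkedHashSet<>();
        for (Row row : sorted) {
            ids.add(row.postId);
        }
        return new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        // Post 3 is missing, and the older post 1 comes first because it has the newest comment.
        System.out.println(distinctPostIds(innerJoinRows(), FIRST_QUERY_ORDER)); // [1, 2]
        // All posts are present, newest post first.
        System.out.println(distinctPostIds(leftJoinRows(), REVISED_QUERY_ORDER)); // [3, 2, 1]
    }
}
```

In the model, sorting by the Post's creation time first fixes the Post order, and the comment key only decides the order among rows of the same Post.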

fetch join / Batch Size

fetch join - A way to load data from several tables with a single query. It can reduce the load that arises when many separate queries would otherwise have to be executed.

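Batch Size - The number of records loaded at one time when processing data. Rather than loading everything at once, data is fetched in smaller batches, and the batch size setting is the parameter that controls how much is fetched per query. In Hibernate this takes the form of batch fetching (for example the @BatchSize annotation or the hibernate.default_batch_fetch_size property): when a lazy association is first accessed, it is loaded for several pending parent entities in one query with an IN clause instead of one query per parent.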
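As a rough way to compare the two options, the sketch below counts queries under a simplified model. `QueryCountModel` and its method names are hypothetical. The batch figure assumes the comment collections of up to batchSize Posts are loaded per IN query, and the real count can differ depending on the Hibernate version and configuration.

```java
class QueryCountModel {

    // Plain lazy loading: 1 query for the posts, then 1 query per post for its comments (N+1).
    static int lazyQueries(int posts) {
        return 1 + posts;
    }

    // Fetch join: posts and comments arrive together in a single query.
    static int fetchJoinQueries(int posts) {
        return 1;
    }

    // Batch fetching: 1 query for the posts, then one IN query per group of batchSize posts.
    static int batchQueries(int posts, int batchSize) {
        return 1 + (posts + batchSize - 1) / batchSize; // integer ceiling of posts / batchSize
    }

    public static void main(String[] args) {
        int posts = 250;
        System.out.println(lazyQueries(posts));       // 251
        System.out.println(fetchJoinQueries(posts));  // 1
        System.out.println(batchQueries(posts, 100)); // 4
    }
}
```

Query count is only one axis of the comparison. A collection fetch join repeats the Post columns on every Comment row, while batch fetching keeps the rows narrow at the cost of a few extra round trips.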