Spring Data JPA DistinctBy projections


Good day, fellow hibernators! I have a question about how the DistinctBy clause works in conjunction with Spring Data's projections.

Assume I have 3 classes:

public class Task {
  Long id;
  @ManyToOne(fetch = LAZY)
  @JoinColumn(name = "project_id")
  private Project project;
  @OneToOne
  @JoinColumn(name = "contact_id")
  private Contact assigned;
  Boolean deleted;
  // ...
}

public class Contact {
  Long id;
  // ...
}

public class Project { 
  Long id;
  @OneToMany(fetch = LAZY, mappedBy = "project")
  private Set<Task> tasks;
  // ...
}

These are my domain classes. Note that Project has a one-to-many association to Task, while Contact does not. I also have two projection interfaces and a basic TaskRepo with two methods:

public interface JustProject {
  Project getProject();
}

public interface JustAssignee {
  // The getter has to match the Task property name, which is "assigned"
  Contact getAssigned();
}

public interface TaskRepo extends CrudRepository<Task, Long>, JpaSpecificationExecutor<Task> {
    List<JustAssignee> findDistinctByDeletedFalse();
    List<JustProject> findDistinctByDeletedFalseAndDeletedFalse();
}

Right now, findDistinctByDeletedFalse returns as many instances as there are distinct contacts across the non-deleted tasks (e.g. if there are 10 tasks but only 3 contacts, the method returns just 3 objects, one per distinct contact). The same holds for findDistinctByDeletedFalseAndDeletedFalse, but at the project level.

Now I have a few questions here and would love to get some help in understanding how this works exactly.

  • Is the distinct clause applied after the search is done?

    • My initial assumption was that this would not work the way it does now. I assumed that the distinct clause is applied before the result is fetched, meaning that it would be DISTINCT based on the underlying Task model, not on the returned JustAssignee or JustProject model.
  • Is there any way to avoid abusing the redundant ...AndDeletedFalse suffix? I need both methods in the repo, but I feel like I had to cheat just to get there.

  • Am I doing something wrong? I wanted to get "all distinct contacts/projects assigned to all tasks" as elegantly as possible. I only ended up trying DistinctBy because I was unsure how it works and wanted to try my luck. I really didn't think it would work this way, but now that it does, I would like to understand why.

Many thanks <3


There are 2 best solutions below

  1. The DISTINCT keyword is applied to the query, so its effect depends on the select list, which in turn is controlled by the projection. If your projection contains only the project or only the contact, DISTINCT is applied to those values only. Note, though, that this relies somewhat on the boundaries of the JPA specification, and I wouldn't be surprised if you saw different behaviour with different implementations. See https://github.com/eclipse-ee4j/jpa-api/issues/189 and https://github.com/eclipse-ee4j/jpa-api/issues/124 for somewhat related issues raised against the specification.

  2. In order to differentiate methods that otherwise differ only in their return type, you can add any additional string between find and By in the method name. For example, you could rename your methods to findDistinctContactsByDeletedFalse and findDistinctProjectsByDeletedFalse.
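
To make point 1 a bit more concrete: the derived query presumably ends up as something like "select distinct t.project from Task t where t.deleted = false", so de-duplication happens on whatever is selected, not on the task rows. Below is a minimal plain-Java sketch of that ordering (filter, then project, then distinct). It does not use JPA or Spring at all; the class DistinctProjectionSketch, the TaskRow type and the sample data are made up purely for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the Task table; no JPA or Spring involved.
public class DistinctProjectionSketch {

    // One row per task: its id, its project's id, its assignee's id, and the deleted flag.
    static final class TaskRow {
        final long id;
        final long projectId;
        final long contactId;
        final boolean deleted;

        TaskRow(long id, long projectId, long contactId, boolean deleted) {
            this.id = id;
            this.projectId = projectId;
            this.contactId = contactId;
            this.deleted = deleted;
        }
    }

    // Filter, project to the contact column, then de-duplicate the projected values.
    static List<Long> distinctContactIds(List<TaskRow> tasks) {
        return tasks.stream()
                .filter(t -> !t.deleted)
                .map(t -> t.contactId)
                .distinct()
                .collect(Collectors.toList());
    }

    // Same pipeline, but the "select list" is the project column.
    static List<Long> distinctProjectIds(List<TaskRow> tasks) {
        return tasks.stream()
                .filter(t -> !t.deleted)
                .map(t -> t.projectId)
                .distinct()
                .collect(Collectors.toList());
    }

    // DISTINCT over the task identity itself removes nothing, since task ids are unique.
    static List<Long> distinctTaskIds(List<TaskRow> tasks) {
        return tasks.stream()
                .filter(t -> !t.deleted)
                .map(t -> t.id)
                .distinct()
                .collect(Collectors.toList());
    }

    // Made-up sample data: five live tasks and one deleted task.
    static List<TaskRow> sampleTasks() {
        return Arrays.asList(
                new TaskRow(1, 10, 100, false),
                new TaskRow(2, 10, 101, false),
                new TaskRow(3, 10, 100, false),
                new TaskRow(4, 11, 102, false),
                new TaskRow(5, 11, 101, false),
                new TaskRow(6, 12, 103, true));
    }

    public static void main(String[] args) {
        List<TaskRow> tasks = sampleTasks();
        System.out.println(distinctContactIds(tasks)); // [100, 101, 102]
        System.out.println(distinctProjectIds(tasks)); // [10, 11]
        System.out.println(distinctTaskIds(tasks));    // [1, 2, 3, 4, 5]
    }
}
```

The same five live tasks collapse to three contacts or two projects depending only on which column is projected, which matches the behaviour described in the question.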


I guess this is the best that you can get with Spring Data JPA. You might be able to use just a single method by using the dynamic projections approach, but I think this is a perfect use case for Blaze-Persistence Entity Views.

I created the library to allow easy mapping between JPA models and custom models defined as interfaces or abstract classes, something like Spring Data Projections on steroids. The idea is that you define your target structure (domain model) the way you like and map attributes (getters) to the entity model via JPQL expressions.

A DTO model for your use case could look like the following with Blaze-Persistence Entity-Views:

@EntityView(Task.class)
public interface TaskAggregateDto {
    // A synthetic "id" to get a grouping context on object level
    @IdMapping("1")
    int getGroupKey();

    Set<ProjectDto> getProjects();
    Set<ContactDto> getContacts();

    @EntityView(Project.class)
    interface ProjectDto {
        @IdMapping
        Long getId();
        String getName();
    }
    @EntityView(Contact.class)
    interface ContactDto {
        @IdMapping
        Long getId();
        String getName();
    }
}

The Spring Data integration allows you to use it almost like Spring Data Projections: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features

public interface TaskRepo extends CrudRepository<Task, Long>, JpaSpecificationExecutor<Task> {
    TaskAggregateDto findOneByDeletedFalse();
}