Group a Java list by date and keep the record with the max field value for each date using streams


I have a list that looks like this:

[
  {
    "busId": "4-3323309834",
    "dataDateTime": "2022-08-27 05:22:46",
    "busName": "27Tr Solaris Single Deck",
    "speedLimit": 80,
    "id": 1,
    "currentSpeed": 67,
    "passengersNo": 80
  },
  {
    "busId": "4-3323309834",
    "dataDateTime": "2022-08-27 07:12:10",
    "busName": "27Tr Solaris Single Deck",
    "speedLimit": 80,
    "id": 2,
    "currentSpeed": 56,
    "passengersNo": 11
  },
  {
    "busId": "4-3323309834",
    "dataDateTime": "2022-08-26 03:12:10",
    "busName": "27Tr Solaris Single Deck",
    "speedLimit": 80,
    "id": 3,
    "currentSpeed": 31,
    "passengersNo": 15
  },
  {
    "busId": "4-3323309834",
    "dataDateTime": "2022-08-26 03:12:10",
    "busName": "27Tr Solaris Single Deck",
    "speedLimit": 80,
    "id": 3,
    "currentSpeed": 78,
    "passengersNo": 15
  },
  {
    "busId": "4-3323309834",
    "dataDateTime": "2022-08-26 04:34:10",
    "busName": "27Tr Solaris Single Deck",
    "speedLimit": 80,
    "id": 7,
    "currentSpeed": 49,
    "passengersNo": 57
  }
]

What I would like to achieve is to filter the list so that, for each date (for example 2022-08-27 and 2022-08-26), only the one record with the maximum currentSpeed is returned, so that my final list looks like this:

[
  {
    "busId": "4-3323309834",
    "dataDateTime": "2022-08-27 05:22:46",
    "busName": "27Tr Solaris Single Deck",
    "speedLimit": 80,
    "id": 1,
    "currentSpeed": 67,
    "passengersNo": 80
  },
  {
    "busId": "4-3323309834",
    "dataDateTime": "2022-08-26 03:12:10",
    "busName": "27Tr Solaris Single Deck",
    "speedLimit": 80,
    "id": 3,
    "currentSpeed": 78,
    "passengersNo": 15
  }
]

Below is my code, showing how I populate the list:

public static void main(String[] args) {

        List<HashMap<String, Object>> myList = new ArrayList<>();

        myList.add(new HashMap<>(Map.of("id", 1,
                "busId","4-3323309834",
                "busName","27Tr Solaris Single Deck",
                "currentSpeed",67,
                "passengersNo",80,
                "speedLimit",80,
                "dataDateTime","2022-08-27 05:22:46")));

        myList.add(new HashMap<>(Map.of("id",2,
                "busId","4-3323309834",
                "busName","27Tr Solaris Single Deck",
                "currentSpeed",56,
                "passengersNo",11,
                "speedLimit",80,
                "dataDateTime","2022-08-27 07:12:10")));

        myList.add(new HashMap<>(Map.of(
                "id",3,
                "busId","4-3323309834",
                "busName","27Tr Solaris Single Deck",
                "currentSpeed",31,
                "passengersNo",15,
                "speedLimit",80,
                "dataDateTime","2022-08-26 03:12:10")));

        myList.add(new HashMap<>(Map.of(
                "id",3,
                "busId","4-3323309834",
                "busName","27Tr Solaris Single Deck",
                "currentSpeed",78,
                "passengersNo",15,
                "speedLimit",80,
                "dataDateTime","2022-08-26 03:12:10")));

        myList.add(new HashMap<>(Map.of(
                "id",7,
                "busId","4-3323309834",
                "busName","27Tr Solaris Single Deck",
                "currentSpeed",49,
                "passengersNo",57,
                "speedLimit",80,
                "dataDateTime","2022-08-26 04:34:10")));

    }

Below is what I am trying to use to filter, but I am getting a "Not a statement" error in the filter code:

        List<HashMap<String, Object>> myList2 = myList.stream()
                .collect(Collectors.groupingBy(hashmap ->
                                List.of(hashmap.get("busId"),
                                        hashmap.get("currentSpeed"),
                        Collectors.maxBy(Comparator.comparing(HashMap::get("dataDateTime")))));




        System.out.println(Config.ANSI_CYAN + "MyListToJsonPrint: " +
                new GsonBuilder().setPrettyPrinting().create().toJson(myList2));

Is there an efficient way to do this filtering?

Eritrean On BEST ANSWER

If you need the map with the maximum currentSpeed for each day, group by day and reduce each group to the map with the highest currentSpeed. You can use LocalDateTime from the java.time API to parse your dates, which makes it easier to use the date part of each value as the classifier for grouping.

Basically the steps you want are:

  • Stream over your list
  • collect to map using the date as key
  • map to the hashmap having the max value of currentSpeed using a BinaryOperator and a Comparator
  • the above steps will result in a Map<LocalDate, HashMap<String, Object>>
  • and finally get the values from the above map

Code:

DateTimeFormatter dtf = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

List<HashMap<String, Object>> result = new ArrayList<>(
        myList.stream()
              .collect(Collectors.toMap(map -> LocalDateTime.parse((String) map.get("dataDateTime"), dtf).toLocalDate(),
                                        Function.identity(),
                                        BinaryOperator.maxBy(Comparator.comparingInt(map -> (int)map.get("currentSpeed")))))
              .values());

result.forEach(System.out::println);

You could also use Collectors.groupingBy in combination with Collectors.reducing, but then you need to unwrap the Optional that results from the reducing step:

List<HashMap<String, Object>> result2 = new ArrayList<>(
        myList.stream()
              .collect(Collectors.groupingBy(map -> LocalDateTime.parse((String) map.get("dataDateTime"), dtf).toLocalDate(),
                                             Collectors.collectingAndThen(
                                                     Collectors.reducing(BinaryOperator.maxBy(Comparator.comparingInt(map -> (int)map.get("currentSpeed")))),
                                                     Optional::get)))
              .values());

result2.forEach(System.out::println);
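
Neither variant guarantees the iteration order of the resulting map, so the order of the final list is unspecified. If the order matters, as in the expected output in the question (newest date first), a sorted map can be supplied as the fourth argument of Collectors.toMap. The following is a minimal, self-contained sketch under that assumption; the class name, the row helper, and the trimmed-down sample rows are made up for illustration and are not part of the original code.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class FastestPerDayOrdered {

    private static final DateTimeFormatter DTF = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Hypothetical helper: builds a row containing only the keys this sketch reads.
    static Map<String, Object> row(int id, int currentSpeed, String dataDateTime) {
        Map<String, Object> row = new HashMap<>();
        row.put("id", id);
        row.put("currentSpeed", currentSpeed);
        row.put("dataDateTime", dataDateTime);
        return row;
    }

    // Keeps the row with the highest currentSpeed per calendar date, newest date first.
    static List<Map<String, Object>> fastestPerDay(List<Map<String, Object>> rows) {
        Comparator<Map<String, Object>> bySpeed =
                Comparator.comparingInt(r -> (Integer) r.get("currentSpeed"));
        // A TreeMap with a reversed comparator iterates from the latest date to the earliest.
        Supplier<TreeMap<LocalDate, Map<String, Object>>> newestFirst =
                () -> new TreeMap<>(Comparator.reverseOrder());

        Map<LocalDate, Map<String, Object>> byDate = rows.stream()
                .collect(Collectors.toMap(
                        r -> LocalDateTime.parse((String) r.get("dataDateTime"), DTF).toLocalDate(),
                        Function.identity(),
                        BinaryOperator.maxBy(bySpeed),
                        newestFirst));
        return new ArrayList<>(byDate.values());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = List.of(
                row(1, 67, "2022-08-27 05:22:46"),
                row(2, 56, "2022-08-27 07:12:10"),
                row(3, 31, "2022-08-26 03:12:10"),
                row(3, 78, "2022-08-26 03:12:10"),
                row(7, 49, "2022-08-26 04:34:10"));
        fastestPerDay(rows).forEach(System.out::println);
    }
}
```

With the sample rows, the 2022-08-27 row with speed 67 comes first, followed by the 2022-08-26 row with speed 78.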
yezper On

First of all, I would recommend working with POJOs to encapsulate the data instead of working with Map<String, Object>. This makes working with the data much easier and more robust, because you do not have to worry about extracting values by key and then casting them to the correct data type.
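
As a rough illustration of that recommendation, the sketch below converts each map into a small typed value once, at the boundary, so that the stream logic itself needs no casts. It uses a Java record (Java 16+); the names BusReadingDemo, BusReading, fromMap and fastestPerDay are made up for this example, and only the fields needed for the grouping are modelled.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class BusReadingDemo {

    private static final DateTimeFormatter DTF = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Hypothetical POJO; models only the fields this sketch needs.
    record BusReading(int id, String busId, int currentSpeed, LocalDateTime dataDateTime) {

        // All key lookups and casts happen here, in one place.
        static BusReading fromMap(Map<String, Object> row) {
            return new BusReading(
                    (Integer) row.get("id"),
                    (String) row.get("busId"),
                    (Integer) row.get("currentSpeed"),
                    LocalDateTime.parse((String) row.get("dataDateTime"), DTF));
        }

        LocalDate date() {
            return dataDateTime.toLocalDate();
        }
    }

    // One reading per calendar date: the one with the highest currentSpeed.
    static Collection<BusReading> fastestPerDay(List<Map<String, Object>> rows) {
        return rows.stream()
                .map(BusReading::fromMap)
                .collect(Collectors.toMap(
                        BusReading::date,
                        Function.identity(),
                        BinaryOperator.maxBy(Comparator.comparingInt(BusReading::currentSpeed))))
                .values();
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = List.of(
                Map.<String, Object>of("id", 1, "busId", "4-3323309834",
                        "currentSpeed", 67, "dataDateTime", "2022-08-27 05:22:46"),
                Map.<String, Object>of("id", 2, "busId", "4-3323309834",
                        "currentSpeed", 56, "dataDateTime", "2022-08-27 07:12:10"),
                Map.<String, Object>of("id", 3, "busId", "4-3323309834",
                        "currentSpeed", 78, "dataDateTime", "2022-08-26 03:12:10"));
        fastestPerDay(rows).forEach(System.out::println);
    }
}
```

The casts still exist, but they are confined to fromMap; everything downstream works with typed accessors such as currentSpeed() and date().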

Beyond that, here is a working solution that is similar to yours. It assumes that every Map contains the keys dataDateTime and currentSpeed with valid values, and that every dataDateTime value starts with a date in ISO format (yyyy-MM-dd).

private static List<Map<String, Object>> transform(List<Map<String, Object>> busses, List<LocalDate> dates ){

        return busses.stream()
                // filtering out irrelevant dates
                .filter(bus -> dates.contains(LocalDate.parse(((String) bus.get("dataDateTime")).substring(0, 10))))
                // grouping by the date
                .collect(groupingBy(bus -> ((String) bus.get("dataDateTime")).substring(0, 10)))
                .values().stream()
                // for all busses with the same date get the one that has the max currentSpeed
                .map(sameDateBusses ->
                        sameDateBusses.stream().max(Comparator.comparingInt(bus -> (Integer) bus.get("currentSpeed"))).get()
                // finally make the Stream a List
                ).toList();
    }

You can then call the utility method like this

transform(myList, List.of(LocalDate.of(2022, 8, 27), LocalDate.of(2022, 8, 26)));

Please note: in terms of efficiency, I am pretty sure the problem can be solved in linear time. So if performance is a big deal to you, then there is definitely a better algorithm that you can use.
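
One way to make the linear-time remark concrete is a single pass that keeps only the current best row per day in a HashMap via Map.merge, with the wanted dates held in a Set so that each lookup takes constant time on average. This is only a sketch: the class and method names are made up for this example, and it relies on the same assumptions about the map contents as the solution above.

```java
import java.time.LocalDate;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SinglePassMax {

    // Hypothetical helper: reads currentSpeed, assuming it is always stored as an Integer.
    static int speed(Map<String, Object> bus) {
        return (Integer) bus.get("currentSpeed");
    }

    // Single loop: for every wanted day, remember only the fastest row seen so far.
    static Collection<Map<String, Object>> fastestPerDay(List<Map<String, Object>> buses,
                                                         Set<LocalDate> dates) {
        Map<LocalDate, Map<String, Object>> best = new HashMap<>();
        for (Map<String, Object> bus : buses) {
            // Assumes dataDateTime starts with an ISO date (yyyy-MM-dd).
            LocalDate day = LocalDate.parse(((String) bus.get("dataDateTime")).substring(0, 10));
            if (dates.contains(day)) {
                // merge stores the row if the day is new; otherwise it keeps the faster of the two.
                best.merge(day, bus,
                        (kept, candidate) -> speed(candidate) > speed(kept) ? candidate : kept);
            }
        }
        return best.values();
    }

    public static void main(String[] args) {
        List<Map<String, Object>> buses = List.of(
                Map.<String, Object>of("id", 1, "currentSpeed", 67, "dataDateTime", "2022-08-27 05:22:46"),
                Map.<String, Object>of("id", 2, "currentSpeed", 56, "dataDateTime", "2022-08-27 07:12:10"),
                Map.<String, Object>of("id", 3, "currentSpeed", 78, "dataDateTime", "2022-08-26 03:12:10"),
                Map.<String, Object>of("id", 7, "currentSpeed", 49, "dataDateTime", "2022-08-26 04:34:10"));
        Set<LocalDate> dates = Set.of(LocalDate.of(2022, 8, 27), LocalDate.of(2022, 8, 26));
        fastestPerDay(buses, dates).forEach(System.out::println);
    }
}
```

On a tie the row seen first is kept, because the merge function only replaces the stored row when the candidate is strictly faster.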

Alexander Ivanchenko On

Update

If you insist on a quick and dirty approach using Map, this is how it can be done (the overall logic is the same as explained below):

public static Collection<Map<String, Object>> getFastestByDateInClosedRange(List<Map<String, Object>> routes,
                                                                            LocalDate start,
                                                                            LocalDate end) {
    return routes.stream()
        .filter(busRoute -> isInClosedRange(
            LocalDateTime.parse((String) busRoute.get("dataDateTime"), formatter).toLocalDate(), start, end))
        .collect(Collectors.toMap(
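            // key by calendar date (not the full timestamp) so that readings from the same day are merged
            busRoute -> LocalDateTime.parse((String) busRoute.get("dataDateTime"), formatter).toLocalDate(),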
            Function.identity(),
            BinaryOperator.maxBy(Comparator.comparing(
                busRoute -> (Integer) busRoute.get("currentSpeed")
            ))
        ))
        .values();
}

Use the Power of Objects

It's an antipattern to treat object-related data as a collection of strings.

Such a practice makes your code hard to maintain and error-prone. Marshalling and unmarshalling the data might involve some degree of complexity in some cases, but when you have objects on your hands it is easier to implement and test any logic you need, no matter how elaborate it is.

Let's declare an object:

@AllArgsConstructor
@Getter
@ToString
public static class BusRoute {
    private int id;
    private String busId;
    private String busName;
    private int currentSpeed;
    private int speedLimit;
    private LocalDateTime dataDateTime;
}

Parsing JSON

And here's how we can deserialize a JSON array with Gson using GsonBuilder and TypeToken:

String routesJson = """
    [
        // your JSON
    ]""";
        
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        
Gson gson = new GsonBuilder().registerTypeAdapter(LocalDateTime.class, new JsonDeserializer<LocalDateTime>() {
    @Override
    public LocalDateTime deserialize(JsonElement json, Type type, JsonDeserializationContext jsonDeserializationContext) throws JsonParseException {
        return LocalDateTime.parse(json.getAsJsonPrimitive().getAsString(), formatter);
    }
}).create();
        
Type type = new TypeToken<ArrayList<BusRoute>>() {
        }.getType();
        
List<BusRoute> routes = gson.fromJson(routesJson, type);

Fetching the Data with Streams

To find the fastest BusRoute for each date in the given range, we need to filter by date and then group the data into a Map. For that purpose we can use the collector toMap(), which expects a mergeFunction as its third argument.

The same can be done using groupingBy(), but it would require two collectors instead of one. Since our goal is to associate each key with a single value rather than a collection of values, toMap() is semantically the more appropriate choice:

public static Collection<BusRoute> getFastestByDateInClosedRange(List<BusRoute> routes,
                                                                 LocalDate start,
                                                                 LocalDate end) {
    return routes.stream()
        .filter(busRoute -> isInClosedRange(busRoute.getDataDateTime().toLocalDate(), start, end))
        .collect(Collectors.toMap(
            busRoute -> busRoute.getDataDateTime().toLocalDate(),
            Function.identity(),
            BinaryOperator.maxBy(Comparator.comparingInt(BusRoute::getCurrentSpeed))
        ))
        .values();
}

public static boolean isInClosedRange(LocalDate candidate, LocalDate start, LocalDate end) {
    
    return candidate.isEqual(start) || candidate.isEqual(end) ||
        candidate.isAfter(start) && candidate.isBefore(end);
}

The following line

getFastestByDateInClosedRange(routes, LocalDate.parse("2022-08-26"), LocalDate.parse("2022-08-27"))
            .forEach(System.out::println);

would give the following output for the sample data provided in the question:

GroupOfSetsUniqueCombinations.BusRoute(id=1, busId=4-3323309834, busName=27Tr Solaris Single Deck, currentSpeed=67, speedLimit=80, dataDateTime=2022-08-27T05:22:46)
GroupOfSetsUniqueCombinations.BusRoute(id=3, busId=4-3323309834, busName=27Tr Solaris Single Deck, currentSpeed=78, speedLimit=80, dataDateTime=2022-08-26T03:12:10)