Java: Identify common paths in ArrayList<String> using Lambdas


I have a list of paths like this:

ArrayList<String> t = new ArrayList<>();
t.add("/folder1/sub-folder1");
t.add("/folder2/sub-folder2");
t.add("/folder1/sub-folder1/data");

I need to get /folder1/sub-folder1 as the output, since it is the most frequently repeated path.

In Python this can be achieved using the function below:

   def getRepeatedPath(self, L):
         """ Returns the most frequently repeated path/string in the provided list """
         try:
             # g is presumably itertools.groupby; its import is not shown here
             pkgname = max(g(sorted(L)), key=lambda(x, v): (len(list(v)), -L.index(x)))[0]
             return pkgname.replace("/", ".")
         except:
             return "UNKNOWN"

I am trying to write an equivalent in Java using lambdas. I got stuck and need some help with the lambda implementation.

public String mostRepeatedSubString(ArrayList<String> pathArray) {
    Collections.sort(pathArray);
    String mostRepeatedString = null;
    Map<String,Integer> x = pathArray.stream().map(s->s.split("/")).collect(Collectors.toMap());
    return mostRepeatedString;
}

Best answer

Lots of tweaking, but I finally got it!

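  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.List;
  import java.util.Map.Entry;
  import java.util.stream.Collectors;

  // The methods below go inside your own class.
  public static void main(String[] args) {
    List<String> t = new ArrayList<>();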
    t.add("folder1/sub-folder1");
    t.add("folder2/sub-folder2");
    t.add("folder1/sub-folder1/data");
    System.out.println(mostRepeatedSubString(t));
  }

  public static String mostRepeatedSubString(List<String> pathArray) {
    return pathArray
      .stream()
      // Split to lists of strings
      .map(s -> Arrays.asList(s.split("/")))
      // Group by first folder
      .collect(Collectors.groupingBy(lst -> lst.get(0)))
      // Find the key with the largest list value
      .entrySet()
      .stream()
      .max((e1, e2) -> Integer.compare(e1.getValue().size(), e2.getValue().size()))
      // Extract that largest list
      .map(Entry::getValue)
      .orElse(Arrays.asList())
      // Intersect the lists in that list to find maximal matching
      .stream()
      // A lambda avoids having to name the enclosing class in a method reference
      .reduce((lst1, lst2) -> commonPrefix(lst1, lst2))
      // Change back to a string
      .map(lst -> String.join("/", lst))
      .orElse("");
  }

  // Returns the longest run of leading elements shared by both lists
  private static List<String> commonPrefix(List<String> lst1, List<String> lst2) {
    int maxIndex = 0;
    while (maxIndex < Math.min(lst1.size(), lst2.size())
        && lst1.get(maxIndex).equals(lst2.get(maxIndex))) {
      maxIndex++;
    }

    return lst1.subList(0, maxIndex);
  }

Note that I had to remove the leading / from the paths. Otherwise split("/") would produce an empty string as the first element of every path list, so all paths would land in the same group and their common prefix would always be the empty string. It shouldn't be too hard to strip the leading / in pre-processing, though.
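
The pre-processing mentioned above could look something like the following. This is only a sketch: the class name LeadingSlashPaths and the wrapper method mostRepeatedPath are made up for illustration, and it assumes every input path starts with a single / that should be put back on the result.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class LeadingSlashPaths {

  // Hypothetical wrapper (not part of the original answer): strips one leading
  // "/" from each path, runs the pipeline, then restores the "/" on the result.
  // Assumes every input path starts with "/".
  static String mostRepeatedPath(List<String> paths) {
    List<String> stripped = paths.stream()
        .map(p -> p.startsWith("/") ? p.substring(1) : p)
        .collect(Collectors.toList());
    String result = mostRepeatedSubString(stripped);
    return result.isEmpty() ? result : "/" + result;
  }

  // Same pipeline as in the answer: group by first folder, take the largest
  // group, and reduce it to the prefix shared by all of its paths.
  static String mostRepeatedSubString(List<String> pathArray) {
    return pathArray.stream()
        .map(s -> Arrays.asList(s.split("/")))
        .collect(Collectors.groupingBy(lst -> lst.get(0)))
        .entrySet().stream()
        .max((e1, e2) -> Integer.compare(e1.getValue().size(), e2.getValue().size()))
        .map(Map.Entry::getValue)
        .orElse(Collections.emptyList())
        .stream()
        .reduce(LeadingSlashPaths::commonPrefix)
        .map(lst -> String.join("/", lst))
        .orElse("");
  }

  // Longest run of leading elements shared by both lists.
  static List<String> commonPrefix(List<String> lst1, List<String> lst2) {
    int maxIndex = 0;
    while (maxIndex < Math.min(lst1.size(), lst2.size())
        && lst1.get(maxIndex).equals(lst2.get(maxIndex))) {
      maxIndex++;
    }
    return lst1.subList(0, maxIndex);
  }

  public static void main(String[] args) {
    List<String> t = Arrays.asList(
        "/folder1/sub-folder1",
        "/folder2/sub-folder2",
        "/folder1/sub-folder1/data");
    System.out.println(mostRepeatedPath(t)); // prints /folder1/sub-folder1
  }
}
```

Stripping the slash in a separate wrapper leaves the pipeline itself unchanged, so it still works on paths that never had a leading / when called directly.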