TPath ignore case when accessing file [Java TrueZip]


Is there a way to access a file inside an archive while ignoring the case of file names using TrueZIP?

Imagine the following ZIP archive with this content:

MyZip.zip
-> myFolder/tExtFile.txt
-> anotherFolder/TextFiles/file.txt
-> myFile.txt
-> anotherFile.txt
-> OneMOREfile.txt

This is how it works:

TPath tPath = new TPath("MyZip.zip\\myFolder\\tExtFile.txt");
System.out.println(tPath.toFile().getName()); //prints tExtFile.txt 

How can I do the same but ignore case, like this:

// note "myFolder" changed to "myfolder" and "tExtFile" to "textfile"    
TPath tPath = new TPath("MyZip.zip\\myfolder\\textfile.txt");
System.out.println(tPath.toFile().getName()); // should print tExtFile.txt

The code above throws FsEntryNotFoundException ... (no such entry)

This works for a regular java.io.File, so I am not sure why it does not work for TrueZIP's TFile. Or am I missing something?

My goal is to access each file just using only lowercase for files and folders.

Edit: 24-03-2017

Let's say I would like to read the bytes of a file inside the ZIP archive MyZip.zip mentioned above:

Path tPath = new TPath("...MyZip.zip\\myFolder\\tExtFile.txt");
byte[] bytes = Files.readAllBytes(tPath); //returns bytes of the file 

The snippet above works, but the one below does not (it throws the FsEntryNotFoundException mentioned above). It is the same path and file, just in lowercase.

Path tPath = new TPath("...myzip.zip\\myfolder\\textfile.txt");
byte[] bytes = Files.readAllBytes(tPath);

kriegaex On BEST ANSWER

You said:

My goal is to access each file just using only lowercase for files and folders.

But wishful thinking will not get you very far here. As a matter of fact, many file systems (the typical Linux and UNIX ones, for instance) are case-sensitive, i.e. in them it makes a big difference whether you use upper- or lower-case characters. There you can even have the "same" file name in different case multiple times in the same directory, so file.txt, File.txt and file.TXT are three different files. Windows file systems are the notable exception here, but TrueZIP does not emulate a Windows file system. It provides a general archive file system which works for ZIP, TAR etc. on all platforms. Thus, you do not get to choose between upper- and lower-case characters; you have to use them exactly as stored in the ZIP archive.


Update: Just as a little proof, I logged into a remote Linux box with an extfs file system and did this:

~$ mkdir test
~$ cd test
~/test$ touch file.txt
~/test$ touch File.txt
~/test$ touch File.TXT
~/test$ ls -l
total 0
-rw-r--r-- 1 group user 0 Mar 25 00:14 File.TXT
-rw-r--r-- 1 group user 0 Mar 25 00:14 File.txt
-rw-r--r-- 1 group user 0 Mar 25 00:14 file.txt

As you can clearly see, there are three distinct files, not just one.

And what happens if you zip those three files into an archive?

~/test$ zip ../files.zip *
  adding: File.TXT (stored 0%)
  adding: File.txt (stored 0%)
  adding: file.txt (stored 0%)

Three files added. But are they still distinct files in the archive or just stored under one name?

~/test$ unzip -l ../files.zip
Archive:  ../files.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2017-03-25 00:14   File.TXT
        0  2017-03-25 00:14   File.txt
        0  2017-03-25 00:14   file.txt
---------                     -------
        0                     3 files

"3 files", it says - quod erat demonstrandum.

As you can see, Windows is not the whole world. But if you copy that archive to a Windows box and unzip it there, it will only write one file to a disk with NTFS or FAT file system - which one is a matter of luck. Very bad if the three files have different contents.
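
The same point can be checked from Java without a Linux box. The following is only a minimal sketch that uses the JDK's own java.util.zip classes rather than TrueZIP; the class name ZipCaseDemo and its two helper methods are made up for illustration. It writes three entries whose names differ only in case into an in-memory ZIP archive and reads the names back:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of TrueZIP
public class ZipCaseDemo {

  // Writes one empty entry per name into an in-memory ZIP archive
  static byte[] zipOf(String... names) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
      for (String name : names) {
        zip.putNextEntry(new ZipEntry(name));
        zip.closeEntry();
      }
    }
    return bytes.toByteArray();
  }

  // Reads the entry names back in the order in which they were stored
  static List<String> entryNames(byte[] archive) throws IOException {
    List<String> names = new ArrayList<>();
    try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
      for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
        names.add(entry.getName());
      }
    }
    return names;
  }

  public static void main(String[] args) throws IOException {
    List<String> names = entryNames(zipOf("File.TXT", "File.txt", "file.txt"));
    System.out.println(names.size() + " entries: " + names);
  }
}
```

All three names should come back as separate entries, because the archive treats names that differ only in case as different entries.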


Update 2: Okay, there is no solution within TrueZIP for the reasons explained in detail above, but if you want to work around it, you can do it manually like this:

package de.scrum_master.app;

import de.schlichtherle.truezip.nio.file.TPath;

import java.io.IOException;
import java.net.URISyntaxException;
import java.nio.file.Files;

public class Application {
  public static void main(String[] args) throws IOException, URISyntaxException {
    TPathHelper tPathHelper = new TPathHelper(
      new TPath(
        "../../../downloads/powershellarsenal-master.zip/" +
          "PowerShellArsenal-master\\LIB/CAPSTONE\\LIB\\X64\\LIBCAPSTONE.DLL"
      )
    );
    TPath caseSensitivePath = tPathHelper.getCaseSensitivePath();
    System.out.printf("Original path: %s%n", tPathHelper.getOriginalPath());
    System.out.printf("Case-sensitive path: %s%n", caseSensitivePath);
    System.out.printf("File size: %,d bytes%n", Files.readAllBytes(caseSensitivePath).length);
  }
}

package de.scrum_master.app;

import de.schlichtherle.truezip.file.TFile;
import de.schlichtherle.truezip.nio.file.TPath;

import java.io.IOException;
import java.net.URISyntaxException;
import java.nio.file.Path;

public class TPathHelper {
  private final TPath originalPath;
  private TPath caseSensitivePath;

  public TPathHelper(TPath tPath) {
    originalPath = tPath;
  }

  public TPath getOriginalPath() {
    return originalPath;
  }

  public TPath getCaseSensitivePath() throws IOException, URISyntaxException {
    if (caseSensitivePath != null)
      return caseSensitivePath;
    final TPath absolutePath = new TPath(originalPath.toFile().getCanonicalPath());
    TPath matchingPath = absolutePath.getRoot();
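    for (Path subPath : absolutePath) {
      boolean matchFound = false;
      // As with java.io.File, listFiles() may return null if matchingPath
      // is not a directory or cannot be read
      TFile[] candidateFiles = matchingPath.toFile().listFiles();
      if (candidateFiles == null)
        throw new IOException("cannot list contents of '" + matchingPath + "'");
      for (TFile candidateFile : candidateFiles) {
        // Compare ignoring case, then continue with the spelling actually stored
        if (candidateFile.getName().equalsIgnoreCase(subPath.toString())) {
          matchFound = true;
          matchingPath = new TPath(matchingPath.toString(), candidateFile.getName());
          break;
        }
      }
      if (!matchFound)
        throw new IOException("element '" + subPath + "' not found in '" + matchingPath + "'");
    }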
    caseSensitivePath = matchingPath;
    return caseSensitivePath;
  }
}

Of course, this is a little ugly and will just get you the first matching path if there are multiple case-insensitive matches in an archive. The algorithm will stop searching after the first match in each subdirectory. I am not particularly proud of this solution, but it was a nice exercise and you seem to insist that you want to do it this way. I just hope you are never confronted with a UNIX-style ZIP archive created on a case-sensitive file system and containing multiple possible matches.
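
If silently picking the first match is too risky, the ambiguity can at least be detected. The following is only a sketch under assumptions: the class CaseInsensitiveIndex is hypothetical, it is not part of TrueZIP, and it expects that you have already collected the stored names yourself (for example by listing the archive). It lower-cases with Locale.ROOT, which can differ from equalsIgnoreCase for a few unusual characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of TrueZIP
public class CaseInsensitiveIndex {
  // Lower-cased name -> all stored names that differ from it only in case
  private final Map<String, List<String>> index = new TreeMap<>();

  public CaseInsensitiveIndex(Iterable<String> storedNames) {
    for (String name : storedNames) {
      index.computeIfAbsent(name.toLowerCase(Locale.ROOT), key -> new ArrayList<>()).add(name);
    }
  }

  // Returns the stored spelling; throws if there is no match or more than one
  public String resolve(String anyCaseName) {
    List<String> matches = index.get(anyCaseName.toLowerCase(Locale.ROOT));
    if (matches == null)
      throw new IllegalArgumentException("no entry matches '" + anyCaseName + "'");
    if (matches.size() > 1)
      throw new IllegalStateException("ambiguous name '" + anyCaseName + "': " + matches);
    return matches.get(0);
  }

  public static void main(String[] args) {
    CaseInsensitiveIndex index = new CaseInsensitiveIndex(
      Arrays.asList("myFolder/tExtFile.txt", "myFile.txt", "OneMOREfile.txt"));
    System.out.println(index.resolve("myfolder/textfile.txt"));
  }
}
```

With the sample names from the question, resolve("myfolder/textfile.txt") yields myFolder/tExtFile.txt, while an index containing both File.txt and file.TXT would throw for file.txt instead of guessing.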

BTW, the console log for my sample file looks like this:

Original path: ..\..\..\downloads\powershellarsenal-master.zip\PowerShellArsenal-master\LIB\CAPSTONE\LIB\X64\LIBCAPSTONE.DLL
Case-sensitive path: C:\Users\Alexander\Downloads\PowerShellArsenal-master.zip\PowerShellArsenal-master\Lib\Capstone\lib\x64\libcapstone.dll
File size: 3.629.294 bytes
HRgiger On

I don't have TrueZIP installed, but I was also wondering how this would work with a normal Path, so I implemented it as shown below, quite similar to @kriegaex's solution. You can try using caseCheck(path):

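import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class Main {

    /**
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {

        Path path = Paths.get("/home/user/workspace/JParser/myfolder/yourfolder/Hisfolder/a.txt");

        Instant start = Instant.now();
        Path resolution;
        try{
            resolution = caseCheck(path);
        }catch (Exception e) {
            throw new IllegalArgumentException("Couldn't access given path", e);
        }
        Instant end = Instant.now();

        Duration duration = Duration.between(start, end);

        System.out.println("Path is: " + resolution + " process took " + duration.toMillis() + "ms");

    }

    /**
     * Resolves the given path element by element, ignoring case.
     *
     * @param path path whose elements may differ in case from what is stored
     * @return the path spelled as it is actually stored
     * @throws IOException if a directory cannot be listed
     */
    private static Path caseCheck(Path path) throws IOException {

        Path entryPoint = path.isAbsolute() ? path.getRoot() : Paths.get(".");
        AtomicInteger counter = new AtomicInteger(0);
        while (counter.get() < path.getNameCount()) {
            // Files.list returns only the direct children, whereas Files.walk with
            // depth 1 would also return entryPoint itself and could match it again.
            // The stream holds an open directory, so close it after each level.
            try (Stream<Path> children = Files.list(entryPoint)) {
                entryPoint = children
                        .filter(s -> checkPath(s, path, counter.get()))
                        .findFirst()
                        .orElseThrow(()->new IllegalArgumentException("No folder found"));
            }

            counter.getAndIncrement();

        }

        return entryPoint;

    }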

    /**
     * @param s
     * @param path
     * @param index
     * @return
     */
    private static final boolean checkPath(Path s, Path path, int index){
        if (s.getFileName() == null) {
            return false;
        }
        return s.getFileName().toString().equalsIgnoreCase(path.getName(index).toString());
    }
}