Can there be a (Java 7) FileSystem for which a Path .isAbsolute() but has a null root?


The javadoc for .isAbsolute() says:

Tells whether or not this path is absolute.
An absolute path is complete in that it doesn't need to be combined with other path information in order to locate a file.

Returns: true if, and only if, this path is absolute

The javadoc for .getRoot() says:

Returns the root component of this path as a Path object, or null if this path does not have a root component.

Returns: a path representing the root component of this path, or null

OK, so, I am at a loss here; are there any filesystems out there for which a path may be absolute without a root at all?


EDIT: note that there CAN be paths which have a root but are NOT absolute. For instance, these on Windows systems:

  • C:foo;
  • \foo\bar.

But I am asking for the reverse here: no root and absolute.
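To see how the default file system on a given machine answers for these paths, a small probe along these lines could be used. This is only a sketch: the class name `RootVersusAbsolute` and the helper `describe` are made up for illustration, and the results for `C:foo` and `\foo\bar` depend on the platform (the rooted-but-relative behaviour described above is specific to Windows; elsewhere both are plain relative paths).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class RootVersusAbsolute {
    /** Reports the two properties under discussion for a single path. */
    static String describe(Path p) {
        // getRoot() may return null; string concatenation renders that as "null".
        return p + " -> absolute=" + p.isAbsolute() + ", root=" + p.getRoot();
    }

    public static void main(String[] args) {
        // Interpreted by the default file system, so results differ per platform.
        System.out.println(describe(Paths.get("C:foo")));
        System.out.println(describe(Paths.get("\\foo\\bar")));
        // A plain relative path and its absolute form, for comparison.
        System.out.println(describe(Paths.get("foo")));
        System.out.println(describe(Paths.get("foo").toAbsolutePath()));
    }
}
```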


There are 3 best solutions below

7
On BEST ANSWER

The Definition

The interface states the following about roots:

A root component, that identifies a file system hierarchy, may also be present.

So as you see, the comment seems to imply that roots are used for file system hierarchies. Now we have to reason about what an absolute path is. The interface tells us the following:

An absolute path is complete in that it doesn't need to be combined with other path information in order to locate a file.

So, as you see, the definition of absolute paths says nothing about roots. The only restriction is that we have to be able to locate the file without further information.

Hierarchical File Systems

Most file systems are hierarchical, i.e., they are trees (or graphs if we consider links) or forests. The root of a tree is a node that is not the child of another node (excluding links). Windows file systems, for example, are forests, as they have many roots (C:, D:, ...). Linux usually has only one root, which is /. Roots are very important, as without them it would be hard to start locating a file. In such file systems, you can usually rely on each absolute path having a root.

Non-Hierarchical File Systems

As long as we have a hierarchical file system, we can anticipate a root in an absolute path, but what if we don't have one? Then, an absolute path might not contain a root.

An example that comes to my mind: Distributed file systems like Chord. These are often not hierarchical so the meaning of roots is usually undefined. Instead, a file hash identifies a file (SHA-1 in Chord). So a valid Chord path might look like this:

cf23df2207d99a74fbe169e3eba035e633b65d94

This is an absolute path. One can retrieve the associated file without further information, so the path is absolute. However, I see no root. We could define the whole hash to be its own root (then each file would be its own root), but nobody can guarantee that every person that implements a Chord file system will agree to this. So there might be reasonable implementations that do not treat these hashes as roots. In such a file system, each path would be absolute, but none would contain a root.

If I were to implement a non-hierarchical file system, I would always return null as the root, since IMHO a root is not a defined concept in a non-hierarchical file system. Since I think this way, other devs might as well. Consequently, you cannot assume that every absolute path has a root.
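To make the idea concrete, here is a minimal sketch of what such a path could look like from the caller's side. Everything in it is hypothetical: `HashPathSketch` and `hashPath` are invented names, and a dynamic proxy stands in for a real `Path` implementation so that only the three methods relevant here need a body. A real provider would have to implement the whole interface and a matching `FileSystem`.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.nio.file.Path;

public class HashPathSketch {
    /**
     * Builds a Path whose only state is a content hash (hypothetical design).
     * Only isAbsolute(), getRoot() and toString() are supported.
     */
    static Path hashPath(final String hash) {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) {
                switch (method.getName()) {
                    case "isAbsolute":
                        return true;   // the hash alone is enough to locate the file
                    case "getRoot":
                        return null;   // no hierarchy, hence no root component
                    case "toString":
                        return hash;
                    default:
                        throw new UnsupportedOperationException(method.getName());
                }
            }
        };
        return (Path) Proxy.newProxyInstance(
                HashPathSketch.class.getClassLoader(),
                new Class<?>[] { Path.class },
                handler);
    }

    public static void main(String[] args) {
        Path p = hashPath("cf23df2207d99a74fbe169e3eba035e633b65d94");
        System.out.println(p + " -> absolute=" + p.isAbsolute() + ", root=" + p.getRoot());
    }
}
```

Nothing in the `Path` javadoc quoted in the question rules out this combination of return values, which is the point of the argument.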

Note that distributed file systems are quite common in many areas, so this is not merely a corner case that will never be implemented. I think you have to anticipate it.

Conclusion

  1. The interface does not mandate that each absolute path must have a root
  2. There are reasonable file systems in which having no root makes sense
  3. An Oracle tutorial, as mentioned in the comments, is no contract for the interface. You should not rely on it

So there will be people implementing file systems without roots; you should anticipate this.
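In practice, anticipating this just means treating the result of getRoot() as nullable regardless of what isAbsolute() says. A minimal sketch (the class `NullSafeRoot` and the method `classify` are illustrative names, not part of any API):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class NullSafeRoot {
    /** Classifies a path without assuming that absolute implies rooted. */
    static String classify(Path p) {
        Path root = p.getRoot();   // may be null, even when isAbsolute() is true
        if (p.isAbsolute()) {
            return root != null ? "absolute, root=" + root : "absolute, no root";
        }
        return root != null ? "relative, root=" + root : "relative, no root";
    }

    public static void main(String[] args) {
        System.out.println(classify(Paths.get("foo", "bar")));
        System.out.println(classify(Paths.get("foo").toAbsolutePath()));
    }
}
```

All four branches are kept on purpose: "relative, root=..." covers the Windows cases from the question, and "absolute, no root" covers a file system like the one described above.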

2
On

Well, there are some obscure things with file systems. I made a few enterprise search crawlers, and somewhere down the road you will notice some strange file system things going on with paths. BTW: these are all implementations of custom (overridden) file systems, so no standard ones, and you can definitely argue for hours about which of those things are good ideas and which are not... Still, I don't think you'll encounter any of these cases with the standard file systems.

Here are a few examples of strange things:

Files in container file systems (OLE2, ZIP, TAR, etc): c:\foo\bar\blah.zip\myfile

In this case, you can decide what item is 'the root':

  • 'c:\' ? That's not the root of the zip file containing the file...
  • 'c:\foo\bar\blah.zip' ? It might be the root of the file, but by doing that it might break your application.
  • 'blah.zip' ? Might be the root of the zip file, but regardless, this would probably break your application as well.
  • '/' ? As in the '/' folder in the zip file? It might be possible, but that will give you a serious headache in the long run.
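For comparison, the zip file system provider bundled with the JDK takes the last option: paths inside the archive live in their own FileSystem, whose root is '/'. The sketch below assumes that provider is available (the "jar:" URI scheme with the "create" option); the class name `ZipRootDemo` and the temporary file names are made up for the example.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class ZipRootDemo {
    /** Opens a fresh zip file system and describes a path inside it. */
    static String describeInnerPath() {
        try {
            Path dir = Files.createTempDirectory("zipdemo");
            Path zip = dir.resolve("blah.zip");
            Map<String, String> env = new HashMap<String, String>();
            env.put("create", "true");   // let the provider create the archive
            String result;
            try (FileSystem zipfs =
                         FileSystems.newFileSystem(URI.create("jar:" + zip.toUri()), env)) {
                // The path does not need to exist to be inspected.
                Path inner = zipfs.getPath("/myfile");
                result = "absolute=" + inner.isAbsolute() + ", root=" + inner.getRoot();
            }
            // Clean up whatever the provider wrote on close.
            Files.deleteIfExists(zip);
            Files.deleteIfExists(dir);
            return result;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeInnerPath());
    }
}
```

Note that the location of blah.zip on disk is not part of the inner path at all; it belongs to the FileSystem, which sidesteps the 'c:\foo\bar\blah.zip\myfile' question rather than answering it.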

'graph' like structures like HTTP:

  • The fact that you have '/foo/bar' doesn't imply that '/foo' or even '/' exists. (Suppose that meets your criterion). The only thing you can do is walk the graph...
  • Note that protocols like WebDav are HTTP based and can give you a similar headache. I have some examples here of custom webdav file systems that don't have a 'root' folder, but do have absolute paths.

Still, you can argue that the top-most common path (if that exists...) that you can reach is the root or that there is a root - but you simply cannot reach it (even though it's really non-existent).

Samba/netbios

If you see a complete Samba (windows networking) network as a single file system, then you basically end up with a 'root' containing all workgroups, a workgroup containing all computers, a computer containing all shares, and then the files in the share.

However... the root and the workgroups don't really exist. They are things that are made up from a broadcast protocol (which is also quite unreliable if you have a network of over 1000 computers). From a crawler perspective, it makes all the sense in the world to treat the 'root' and 'workgroup' directories completely different from the (reliable) rest.

However

These scenarios describe only paths where the root is unreachable, unreliable or something else. Theoretically, I suppose that in any URL you can think of, there is always a root. After all, it's made up as a string of characters defining a hierarchy, which therefore by definition has a start.

3
On

A Question of Semantics

From my understanding of the subject, a path can only be absolute if it can be traced back to its root. As such, there should never be an absolute path without a root. Ultimately this just comes down to semantics, although we can find definitions that define the absolute path as such (e.g. below);

The only real question left after this point is whether the definition by the Java API follows suit. The only place I can find reference to the definition of an absolute path (with reference to the root element) from an official Oracle source is from inside the official Java tutorial. The official Java tutorials say

An absolute path always contains the root element

If this statement is to be believed, then no file system (no matter how obscure) can contain a Path that the Java API will consider absolute, unless it also considers it to contain a root.

You could argue that in some non-hierarchical file systems you might run into issues deciding whether a file can be its own root. However, by this definition in the Path API (emphasis mine), a path should not represent a non-hierarchical element;

A Path represents a path that is hierarchical and composed of a sequence of directory and file name elements