I am storing data about files that exist on an OS X HFS+ filesystem. I later want to iterate over the stored data and figure out whether each file still exists. For my purposes, filename case matters, so if the case of a filename has changed, I consider the file to no longer exist.
I started out by trying
os.path.isfile(filename)
but on a normal install of OS X on HFS+, this returns True even if the filename case does not match. I am looking for a way to write an isfile() function that cares about case even when the filesystem does not.
os.path.normcase() and os.path.realpath() both return the filename in whatever case I pass into them.
Edit:
I now have two functions that seem to work on filenames limited to ASCII. I don't know how Unicode or other characters might affect this.
The first is based off answers given here by omz and Alex L.
import os

def does_file_exist_case_sensitive1a(fname):
    if not os.path.isfile(fname):
        return False
    path, filename = os.path.split(fname)
    search_path = '.' if path == '' else path
    # os.listdir() returns names as stored on disk, so == compares case too
    for name in os.listdir(search_path):
        if name == filename:
            return True
    return False
The second is probably even less efficient.
import glob
import os
import re

def does_file_exist_case_sensitive2(fname):
    if not os.path.isfile(fname):
        return False
    # Find the last ASCII letter in the path
    m = re.search(r'[a-zA-Z][^a-zA-Z]*\Z', fname)
    if m:
        # Replace exactly that letter with '?' so glob has to read the directory
        test = fname[:m.start()] + '?' + fname[m.start() + 1:]
        # '?' may match sibling files as well, so test for membership
        return fname in glob.glob(test)
    else:
        return True  # no letters in file, case sensitivity doesn't matter
Here is a third based off DSM's answer.
import os

def does_file_exist_case_sensitive3(fname):
    if not os.path.isfile(fname):
        return False
    path, filename = os.path.split(fname)
    search_path = '.' if path == '' else path
    # Map inode -> on-disk name; stat entries relative to search_path, not the cwd
    inodes = {os.stat(os.path.join(search_path, x)).st_ino: x
              for x in os.listdir(search_path)}
    return inodes.get(os.stat(fname).st_ino) == filename
I don't expect that these will perform well if I have thousands of files in a single directory. I'm still hoping for something that feels more efficient.
Another shortcoming I noticed while testing these is that they only check the filename for a case match. If I pass them a path that includes directory names, none of these functions so far checks the case of the directory names.
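One way to cover the directory names as well would be to repeat the listing comparison for every component of the path. What follows is only a minimal sketch under a few assumptions: it targets Python 3 (for functools.lru_cache), the names exists_with_exact_case and _names_in are made up for illustration, and it carries the same ASCII-only caveat as the functions above. The cache lists each directory once, which should help when many stored paths share a directory, but it goes stale if files are renamed while the process is running.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def _names_in(directory):
    # Hypothetical helper: list a directory once and keep the names in a
    # set, so repeated checks against the same directory are set lookups.
    return frozenset(os.listdir(directory))


def exists_with_exact_case(path):
    """Return True if path is a file and every component of path matches
    the case reported by the directory listing (illustrative sketch)."""
    if not os.path.isfile(path):
        return False
    while True:
        parent, name = os.path.split(path)
        if not name:
            # Reached the filesystem root; nothing left to compare.
            return True
        # '.' and '..' never appear in listings and have no case to check.
        if name not in ('.', '..') and name not in _names_in(parent or '.'):
            return False
        if not parent:
            # Relative path fully consumed.
            return True
        path = parent
```

Calling `_names_in.cache_clear()` between passes over the stored data would force fresh listings.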
This answer is just a proof of concept because it doesn't attempt to escape special characters, handle non-ASCII characters, or deal with file system encoding issues.
On the plus side, the answer doesn't involve looping through files in Python, and it properly handles checking the directory names leading up to the final path segment.
This suggestion is based on the observation that (at least when using bash), the following command finds the path
/my/path
without error if and only if /my/path
exists with that exact casing. (If brackets are left out of any path part, then that part will not be sensitive to changes in casing.)
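The same bracket behaviour can be tried from Python, because a glob pattern containing [...] is matched against the names that the directory listing actually returns, and on POSIX systems that comparison is case-sensitive. This is a rough sketch rather than the command referred to above. It assumes Python 3.4+ (for glob.escape) and a POSIX-style path separator, and bracket_segment and exists_exact_case_glob are hypothetical names.

```python
import glob
import os


def bracket_segment(segment):
    """Put the first letter of one path segment in [...] and escape any
    glob metacharacters, so the segment can only match itself exactly."""
    pieces = [glob.escape(ch) for ch in segment]  # neutralise *, ? and [
    for i, ch in enumerate(segment):
        if ch.isalpha():
            pieces[i] = '[' + ch + ']'
            break  # one bracket per segment is enough to force a listing
    return ''.join(pieces)


def exists_exact_case_glob(path):
    """Return True if path is a file whose every segment matches the
    on-disk case (illustrative sketch, not a hardened implementation)."""
    pattern = os.sep.join(bracket_segment(s) for s in path.split(os.sep))
    # Each segment pattern matches only its own exact spelling, so any hit
    # means the path exists with this casing.
    return any(os.path.isfile(match) for match in glob.glob(pattern))
```

Segments without letters (including `.` and `..`) are left unbracketed, since case cannot differ for them.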
Here is a sample function based on this idea: