Filesystem speedup - 'find' command


I would like to speed up queries like: find root_dir -atime -5 (find files that were accessed less than five days ago).

I am thinking about storing the filesystem hierarchy and file metadata in a database. Do you know of any solution that can help? Maybe there is a FUSE filesystem that can do this?
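For illustration, here is a rough sketch of that idea using GNU find and the sqlite3 command-line tool (the file names are made up, and a path containing '|' or a newline would break this naive import):

    #!/bin/sh
    # Sketch: index file metadata into SQLite, then answer atime queries in SQL.
    DB=files.db
    LIST=/tmp/files.list

    # One line per file: path|atime|mtime|size (times in epoch seconds).
    find root_dir -printf '%p|%A@|%T@|%s\n' > "$LIST"

    sqlite3 "$DB" 'DROP TABLE IF EXISTS files;'
    sqlite3 "$DB" 'CREATE TABLE files (path TEXT, atime REAL, mtime REAL, size INTEGER);'
    # Default list mode uses '|' as separator, so .import reads the dump as-is.
    printf '.import %s files\n' "$LIST" | sqlite3 "$DB"

    # Rough equivalent of: find root_dir -atime -5
    sqlite3 "$DB" "SELECT path FROM files WHERE atime > strftime('%s','now') - 5*86400;"

Of course, such a database only answers queries as of the last indexing run; keeping it in sync with the live filesystem is the hard part, as the answers below point out.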

3 Answers

Answer 1

There are updatedb and locate (from the same package), but updatedb needs to be run to refresh the database, and locate doesn't appear to be able to search by timestamps. If you'll be writing your own solution, it may still be a good starting point, though.
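For example, assuming the mlocate implementation (where these options exist), you can build a private database for just the tree you care about and query it by name, though not by time:

    # Build a database rooted at root_dir, readable without mlocate's
    # permission filtering, and search it by file name pattern.
    updatedb --require-visibility 0 --output ~/root_dir.db --database-root root_dir
    locate --database ~/root_dir.db '*.log'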

Answer 2

Such a find command causes the traversed directory data and inodes to end up in the filesystem cache, which is why the same command is usually very much faster the second time you run it. The same holds for other commands that traverse the file system, such as du.
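The effect is easy to see by timing the same command twice (no numbers shown here, since they depend entirely on your machine and cache state):

    # The second run is typically served largely from the dentry/inode caches.
    time find root_dir -atime -5 > /dev/null
    time find root_dir -atime -5 > /dev/null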

Note that the initial buildup of your database would also take at least the time a single find takes. Not to mention the synchronization runs needed to reflect filesystem changes in your database; chances are that a complete rebuild of the database would be the fastest way to do that.

So what I would do is run a find through the parts of the file system that interest you (perhaps from a periodic cron job). This way you build up, in some sense, an in-memory "database", and subsequent runs of find, du and the like will be faster.
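A sketch of such a cron job (the path /srv/data is made up; adapt it to the trees you actually query):

    # crontab entry: walk the tree every hour and discard the output,
    # purely to keep directory data and inodes cached.
    0 * * * * find /srv/data > /dev/null 2>&1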

Answer 3

There's no really good solution for this. You could run a cron script once per hour or once per day that builds a database like the one you describe, but if you run it too often you'll put a heavy load on your filesystem, and if you don't run it often enough your results will be outdated.

The other way would be to use a kernel mechanism that informs your program about file system changes. Look at http://en.wikipedia.org/wiki/Inotify to get started. However, this is Linux-specific, and it only allows you to monitor specific directories, not the whole filesystem.
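If the inotify-tools package is available, a minimal sketch of such a watcher could look like this (the watched path is made up, and the loop only prints what a real indexer would write to its database):

    # Watch a tree recursively and emit one line per event. Recursive watches
    # are established per directory, which is slow on huge trees, and events
    # that happen before the watches exist are missed. Paths containing
    # whitespace would need a different --format and parsing.
    inotifywait -m -r \
      -e access -e create -e delete -e moved_to -e moved_from \
      --format '%w%f %e' /srv/data |
    while read -r path events; do
        echo "index update needed for: $path ($events)"
    done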