storing ext3 in a sql database

69 Views Asked by At

Would it be possible to store an ext3-like system in mysql? For example, to build a file metadata system that would include paths and permissions? If so, what would be an example format of it? I suppose the starting denormalized route would be here:

-rw-r--r--@  1 david  staff   552B Nov  2 15:55 wsgi.py
-rw-r--r--   1 david  staff   113B Oct 30 20:45 ~.bash_profile

So we would have 7 columns:

permissions | links | user | group | size | date | filename

I suppose we would need an additional two for 'management':

fileId | directoryId

Does this seem way off base, or would that be possible? What would be a better and more normalized way to store it?

1

There are 1 best solutions below

2
On

Sure, you could crawl the filesystem and collect meta info about each file and directory, and then store it into a database. But that info cannot be a substitute for the filesystem's info, only a 'copy' of it.

I would have (at least) 2 tables: Files and Directories. A file would have a dir_id. A directory would have both a dir_id for itself and a parent_id for walking up the directory tree. The top of the tree (the "root") would be a zero or null.

Soft links, hard links, devices, mounts, /proc, etc, would add challenges, but maybe you don't care about them?

size needs to be a BIGINT. permissions, if encoded, might fit in SMALLINT UNSIGNED, or could be stored as a string. user and group could be either the id or the string; you may need to have an id:name table for users and one for groups. For date, consider TIMESTAMP or maybe DATETIME; keep in mind that the OS may be doing something closer to TIMESTAMP to deal with timezones. (Windows is possibly different.)

If you are storing copies of the files, then I recommend using another table, linked to Files via file_id. But, beware, a LONGBLOB is limited to 4GB, and there are other settings that make it difficult to store anything bigger than 16MB. So, I might suggest chunking anything bigger than 64KB, compressing the chunks, etc. (This probably requires yet another table.)

As for reconstructing the full path from the "hierarchy" that I suggested above, it is only a small amount of code. It could be done in your app code, or in a Stored Procedure. With MySQL 8.0 or MariaDB 10.2, "CTEs" are available to facilitate drilling down a tree.

(Yes, I have done most of these things in a couple of projects in the past.)