Would it be possible to store an ext3-like system in mysql? For example, to build a file metadata system that would include paths and permissions? If so, what would be an example format of it? I suppose the starting denormalized route would be here:
-rw-r--r--@ 1 david staff 552B Nov 2 15:55 wsgi.py
-rw-r--r-- 1 david staff 113B Oct 30 20:45 ~.bash_profile
So we would have 7 columns:
permissions | links | user | group | size | date | filename
I suppose we would need an additional two for 'management':
fileId | directoryId
Does this seem way off base, or would that be possible? What would be a better and more normalized way to store it?
Sure, you could crawl the filesystem and collect meta info about each file and directory, and then store it into a database. But that info cannot be a substitute for the filesystem's info, only a 'copy' of it.
I would have (at least) 2 tables:
Files
andDirectories
. A file would have a dir_id. A directory would have both a dir_id for itself and a parent_id for walking up the directory tree. The top of the tree (the "root") would be a zero or null.Soft links, hard links, devices, mounts, /proc, etc, would add challenges, but maybe you don't care about them?
size
needs to be aBIGINT
.permissions
, if encoded, might fit inSMALLINT UNSIGNED
, or could be stored as a string.user
andgroup
could be either the id or the string; you may need to have an id:name table for users and one for groups. Fordate
, considerTIMESTAMP
or maybeDATETIME
; keep in mind that the OS may be doing something closer toTIMESTAMP
to deal with timezones. (Windows is possibly different.)If you are storing copies of the files, then I recommend using another table, linked to
Files
viafile_id
. But, beware, aLONGBLOB
is limited to 4GB, and there are other settings that make it difficult to store anything bigger than 16MB. So, I might suggest chunking anything bigger than 64KB, compressing the chunks, etc. (This probably requires yet another table.)As for reconstructing the full path from the "hierarchy" that I suggested above, it is only a small amount of code. It could be done in your app code, or in a Stored Procedure. With MySQL 8.0 or MariaDB 10.2, "CTEs" are available to facilitate drilling down a tree.
(Yes, I have done most of these things in a couple of projects in the past.)