Let's say I have a big gzipped file data.txt.gz
, but often the ungzipped version needs to be given to a program. Of course, instead of creating a standalone unpacked data.txt
, one could use the process substitution syntax:
./program <(zcat data.txt.gz)
However, depending on the situation, this can be tiresome and error-prone.
Is there a way to emulate a named process substitution? That is, to create a pseudo-file data.txt
that would 'unfold' into a process substitution zcat data.txt.gz
whenever it is accessed. Not unlike a symbolic link forwards a read operation to another file, but, in this case, it needs to be a temporary named pipe.
Thanks.
Edit (from comments) The actual use-case is having a large gzipped corpus that, besides its usage in its raw form, also sometimes needs to be processed with a series of lightweight operations (tokenized, lowercased, etc.) and then fed to some "heavier" code. Storing a preprocessed copy wastes disk space and repeated retyping the full preprocessing pipeline can introduce errors. In the same time, running the pipeline on-the-fly incurs a tiny computational overhead, hence the idea of a long-lived pseudo-file that hides the details under the hood.
As far as I know, what you are describing does not exist, although it's an intriguing idea. It would require kernel support so that opening the file would actually run an arbitrary command or script instead.
Your best bet is to just save the long command to a shell function or script to reduce the difficulty of invoking the process substitution.