I'm in the process of writing a test suite for some lexing/parsing and it would be much cleaner if I could drop test input/output files in a directory and have dune generate OCaml test cases for each of these during a step in compilation.
I figured I could use dune for this, very much inspired by this documentation page (Preprocessors and PPXs), but I'm struggling to get it to work. I've essentially come to 2 dead ends:
- An alias rule that would execute a script padding each of the test files seemingly wouldn't work:

    (tests
     (names lexer)
     (libraries llvmlexer llvmparser ounit2))

    (rule
     (alias runtest)
     (deps
      (glob_files %{workspace_root}/**/*.ll))
     (action (system "./preprocess-lexer.sh '%{input-file}'")))

As it errors with:

    File "test/dune", line 9, characters 41-54:
    9 |  (action (system "./preprocess-lexer.sh '%{input-file}'")))
                                                 ^^^^^^^^^^^^^
    Error: %{input-file} isn't allowed in this position.

I'm very confused by this. Is this a matter of executing the action once for all files? If so, is it possible to execute it once for each dependency?
- Neither would having all sources/targets specified, as that would entail listing them all in the dune file, since wildcard rules are apparently still not a thing: https://github.com/ocaml/dune/issues/307
Indeed, dune doesn't support wildcard rules at the time of writing. It has, however, very limited support for them, tailored for preprocessing, so that you can specify a rule of the form `*.ml -> *.pp.ml`, with exactly these suffixes. Then, if you have a file `bar.ml`, it will be preprocessed to a `bar.pp.ml` file, which will be dropped in the build directory and used instead of `bar.ml`. This is how the mechanism works, and it is designed to work only with OCaml source files. If it suits you, you just need to fix the suffixes, i.e., rename your `.ll` files to `.ml` and specify a preprocess stanza that uses your preprocessor instead of `cpp` that I have used in the example.

The mechanism described above is called "preprocessing via user actions", which should not be confused with the more general (and also action-based) custom rule stanza. The common use of that stanza is to define a rule that builds one explicitly named file from another, e.g., `foo.data` from `foo.data.src`.
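To make the difference concrete, here is a minimal sketch of the two forms side by side. It is an illustration under assumptions rather than a tested setup: the library name `foo` and the `-P` flag passed to `cpp` are made up for the example, and `with-stdin-from` assumes dune lang 2.0 or later.

```dune
; Preprocessing via user actions: applies to every module of the library.
; Whatever the action prints on stdout becomes <module>.pp.ml.
(library
 (name foo) ; hypothetical library name
 (preprocess
  (action
   (run %{bin:cpp} -P %{input-file})))) ; -P is an assumed choice of flag

; Custom rule: one explicitly named target built from one explicitly
; named dependency. No wildcards are possible here.
(rule
 (targets foo.data)
 (deps foo.data.src)
 (action
  (with-stdin-from %{deps}
   (with-stdout-to %{targets}
    (run ./tools/my_rewriter.sh)))))
```

With the first stanza, each `.ml` file of the library is compiled from the preprocessor's output; with the second, building `foo.data` runs the script once, for that single file only.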
In such a rule, a script like `./tools/my_rewriter.sh` can receive the contents of `foo.data.src` on stdin, and everything it prints will be redirected to `foo.data`. (Note that `./tools/my_rewriter.sh` is the path from the top level of your project.) You can't use a wildcard in the target and dependency names and expect the rule to be called for each file with a matching suffix. Again, at the time of writing such a mechanism is not implemented in dune. You have, however, two options as workarounds.
Option 1. Autogenerating the Rules
You can either rely on the OCaml syntax for dune files and produce a dune file that contains such a rule replicated for each `*.data.src` in the folder. I wouldn't personally recommend this, as the status of the OCaml syntax support is not clear and it might misbehave in general.

Alternatively, you can add an extra stage to your build process, e.g., a `./configure` script that will generate the dune file with all these rules.

You can also write them manually, of course :)
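For illustration, a generated (or hand-written) dune file would simply repeat the same rule once per input. Here is a sketch for two made-up inputs, `a.data.src` and `b.data.src`, assuming the rewriter script reads stdin and writes stdout, and dune lang 2.0 or later for `with-stdin-from`:

```dune
; One rule per input file. The file names are hypothetical; a generator
; would emit one such block for every *.data.src it finds.
(rule
 (targets a.data)
 (deps a.data.src)
 (action
  (with-stdin-from %{deps}
   (with-stdout-to %{targets}
    (run ./tools/my_rewriter.sh)))))

(rule
 (targets b.data)
 (deps b.data.src)
 (action
  (with-stdin-from %{deps}
   (with-stdout-to %{targets}
    (run ./tools/my_rewriter.sh)))))
```

The obvious downside is that the file has to be regenerated (or edited) whenever an input is added or removed.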
Option 2. Using Globs and Directory Dependencies
You can use `glob_files` and then change your action so that it takes a set of files and produces a set of files, e.g., using GNU parallel, so that for each `<foo>.data.in` the rule will produce `<foo>.data`. (Of course, you can write your own for loop instead of using parallel.)

The caveat with this approach is that, since this rule doesn't specify targets, all produced files will eventually be deleted by dune. And the problem is that, unlike `deps`, the `targets` field doesn't accept `glob_files`, which makes perfect sense, as the targets are not expected to exist at the time of rule application.

To the rescue comes the new `directory-targets` feature. To enable it, you need to declare it in your `dune-project` (the lang shall be greater than or equal to 3.0).

Now you can put the test input data files that you would like to preprocess in the same folder as your test driver. In this case, I use `*.data.src` as the input files and `test_foo.ml` as the test driver.
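A sketch of how the pieces might fit together in the test directory's dune file, assuming GNU parallel is installed and that `dune-project` enables the feature with `(using directory-targets 0.1)`. The `mkdir -p` step and the way the test depends on the directory (`(deps data)`) are assumptions on my part and may need adjusting for your dune version:

```dune
; Assumed contents of dune-project:
;   (lang dune 3.0)
;   (using directory-targets 0.1)

; Build the whole data/ directory from every *.data.src next to this file.
(rule
 (deps
  (glob_files *.data.src))
 (targets
  (dir data))
 (action
  (progn
   (run mkdir -p data) ; assumption: the action creates the directory itself
   (run parallel cp {} data/{.} ::: %{deps}))))

; The test driver depends on the generated directory.
(test
 (name test_foo)
 (deps data)) ; assumption: depend on the directory target by its path
```

Here `cp` only stands in for a real preprocessor, so the generated files are plain copies of the inputs.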
The `(run parallel cp {} data/{.} ::: %{deps})` action will call `cp <file>.data.src data/<file>.data` for each `<file>` matching `*.data.src`. You can substitute it with your own command that takes the set of input files and populates the `data` directory with the preprocessed files. This command could even be implemented in OCaml: just specify `./path/to/your/tool.exe` as the command and dune will build it automatically from `./path/to/your/tool.ml`.

In this setup, whenever you change an input `*.data.src` file, or any other dependency of the test, `dune test` will rebuild the data folder and correctly rerun the tests.

For the sake of completeness, here are the contents of my `test_foo.ml` file,

And here's a sample directory structure,
Feel free to poke me if you want to get a fully working example.