In PyYAML or ruamel.yaml, I'm wondering if there is a way to customize how specific strings are parsed. Specifically, I'd like to be able to parse "[inf, nan]" as [float('inf'), float('nan')]. I'd also like "['inf', 'nan']" to continue to parse as ['inf', 'nan'], so it's only the behavior of the unquoted variant that I want to intercept and change.
I'm aware that currently I could use "[.inf, .nan]" or "[!!float inf, !!float nan]", but I'm curious if I could extend the Loader to allow for the syntax that I expected would have worked (but doesn't).
Perhaps I'm just making a footgun by allowing "nan" and "inf" to be parsed as floats rather than strings, and I'm interested in hearing compelling reasons why I should not allow this type of parsing. But I'm not too worried about the case where other parsers would parse my configs incorrectly (though maybe I'm underestimating the pain that will cause in the future). I plan to use this as a one-way convenience for parsing arguments on the command line, and I don't expect actual config files to be written like this.
In any case I'd still be interested in how it could be done, even if the conclusion is that it shouldn't be done.
Based on the confusion that I have seen caused by Yes, On, No and Off being interpreted as boolean values in YAML 1.1, I don't think this is a good idea.
But it is possible to do this both in ruamel.yaml and PyYAML, by changing the regex that recognises floats (i.e. that assigns the implicit tag tag:yaml.org,2002:float to the scalar) and then making sure the routine constructing a float from a scalar handles these additional scalars.
The three main improvements (with regard to this) in ruamel.yaml are that it has different regexes for YAML 1.1 and YAML 1.2 parsing (the latter being the default, the former having to be specified either by a directive, or by setting .version on the YAML() instance); that the various Resolvers each have a copy of these regexes instead of sharing one (as in PyYAML, which makes having multiple, differently behaving parsers in one program difficult); and that regex compilation is delayed until they are actually needed.
Given the differences, the following will only apply to ruamel.yaml.
You need to create a resolver, and replace its regex recognition for all floats, and then create a constructor that constructs the floats based on the recognised scalars:
which gives:
That 1.0 is loaded as a ScalarFloat is necessary to preserve its formatting when dumping. It is possible to preserve the different ways of writing .nan, .inf, nan and inf in a similar way, but you would have to make a special representer and either extend ScalarFloat or make one or more explicit types that keep the original scalar string value. Either way you would lose the possibility to test with x is float('nan'), which may be a problem in real programs (which is also the reason why ruamel.yaml doesn't preserve the different forms of null during round-trip).
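The two steps described above (widening the regex that recognises floats, then constructing floats from the additional scalars) can be illustrated independently of either library. The following is only a minimal, standard-library sketch: FLOAT_RE, construct_float and resolve are hypothetical names, the pattern is an assumption loosely modelled on YAML 1.2 float syntax rather than the regex ruamel.yaml or PyYAML actually ships, and the quoted flag stands in for the information a real resolver gets from the scanner about whether a scalar was plain or quoted.

```python
import math
import re

# Hypothetical pattern (an assumption, not the library's own regex):
# ordinary floats, the standard .inf/.nan forms, plus bare inf/nan.
FLOAT_RE = re.compile(r"""^(?:
      [-+]?[0-9]+\.[0-9]*(?:[eE][-+]?[0-9]+)?   # 1.0, 1., 1.5e3
    | [-+]?[0-9]+[eE][-+]?[0-9]+                # 1e3
    | [-+]?\.[0-9]+(?:[eE][-+]?[0-9]+)?         # .5
    | [-+]?\.(?:inf|Inf|INF)                    # .inf, -.inf
    | \.(?:nan|NaN|NAN)                         # .nan
    | [-+]?(?:inf|Inf|INF)                      # added: inf, -inf
    | (?:nan|NaN|NAN)                           # added: nan
)$""", re.X)


def construct_float(text):
    """Build a float from a scalar that FLOAT_RE has already accepted."""
    value = text.lower()
    sign = 1.0
    if value[0] in "+-":
        if value[0] == "-":
            sign = -1.0
        value = value[1:]
    if value in (".inf", "inf"):
        return sign * math.inf
    if value in (".nan", "nan"):
        return math.nan
    return sign * float(value)


def resolve(text, quoted=False):
    """Return a float for matching plain scalars; leave everything else a string."""
    if not quoted and FLOAT_RE.match(text):
        return construct_float(text)
    return text


print([resolve(s) for s in ("inf", "nan", "1.0", "info")])
print(resolve("inf", quoted=True))
```

Because only plain scalars are matched against the pattern, a quoted 'inf' stays a string, which is the distinction the question asks for. Integers are deliberately not matched here and also stay strings; in a real loader a separate implicit resolver would handle them.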