I'm trying to recreate a ROOT file that contains a char* type branch (which is interpreted by uproot as AsStrings()).
When using mktree, uproot doesn't recognize np.dtype('string'), and when trying np.dtype('S') I get:
TypeError: cannot write NumPy dtype |S0 in TTree
Is it possible to do this, or is it simply not implemented in the package?
"Strings" is not one of the data types that WritableTTree supports. See the blue box under https://uproot.readthedocs.io/en/latest/basic.html#writing-ttrees-to-a-file for a full list.
However, it's possible to write some string-like data. Awkward Arrays of strings are just lists of uint8 type with special metadata (the __array__: "string" parameter) indicating that they should be interpreted as strings. There are actually two types, "string" and "bytestring", in which we assume that the former is UTF-8 encoded and the latter is not.
These data can be written to ROOT files by removing the parameters from the array, so that it looks like a plain array of integers.
Here's a way to write these data into a ROOT file:
When you read them back, the uint8_t* array could be cast as a char* array, but watch out! The strings are not null-terminated (they don't end with a \x00 byte). Many string-interpreting functions in C and C++ won't be expecting that. There are some functions, like strncpy and std::string's two-argument constructor, that can be given string length information so that they don't look for a null-terminator. The string length information is the counter branch, nbranch in the above.
I recognize that that's unpleasant. I just opened a feature request on Uproot for writing string data in a natural way, using ROOT's TLeafC, rather than this hack.
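To illustrate why the length matters when reading, here is a pure-Python simulation (not actual ROOT or C++ code). It assumes a reader that reuses one buffer for every entry, and the buffer size and helper names are made up for the example. In this simulation the unused part of the buffer happens to be zeroed, which a real C buffer does not guarantee.

```python
# Simulated per-entry read buffer; a C/C++ reader would typically reuse one
# buffer for every entry, so bytes from a longer earlier string can linger.
buffer = bytearray(16)


def load_entry(data: bytes) -> int:
    """Copy one entry's bytes into the shared buffer and return its length."""
    buffer[: len(data)] = data
    return len(data)


def read_with_length(n: int) -> str:
    """Use the counter value to take exactly n bytes."""
    return bytes(buffer[:n]).decode("utf-8")


def read_until_null() -> str:
    """Scan for a null terminator, as many C string functions would."""
    end = buffer.find(0)
    if end == -1:
        end = len(buffer)
    return bytes(buffer[:end]).decode("utf-8")


load_entry(b"three")
n = load_entry(b"two")  # overwrites only the first 3 bytes

safe = read_with_length(n)
unsafe = read_until_null()
print(safe)    # two
print(unsafe)  # twoee
```

Reading with the counter value gives the right string, while scanning for a terminator picks up leftover bytes from the previous entry.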