In the C++ codecvt/locale library, is there a proper facet one could use to test whether a character "is" something? That is, to test whether a character is any form of line-breaking character, or represents a numeric, whitespace, etc.?
Or would one have to do this manually or rely on regex?
Yes, use the std::ctype facet and its is() member function. The available classification masks (space, digit, alpha, punct, and so on) are defined in std::ctype_base.
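As a minimal sketch of the std::ctype approach: the helper name has_class is hypothetical (it is not part of the standard library), and the default of the classic "C" locale is an assumption made only to keep the example deterministic. std::use_facet and std::ctype<CharT>::is(mask, c) are the standard calls.

```cpp
#include <locale>

// Hypothetical helper (not a standard function): returns true if c matches
// any of the classification bits in mask, according to the ctype facet of loc.
template <typename CharT>
bool has_class(CharT c, std::ctype_base::mask mask,
               const std::locale& loc = std::locale::classic())
{
    // Fetch the character-classification facet for this character type.
    const auto& facet = std::use_facet<std::ctype<CharT>>(loc);
    return facet.is(mask, c);
}
```

For instance, has_class('7', std::ctype_base::digit) and has_class(' ', std::ctype_base::space) should both hold in the classic locale. Passing a different std::locale lets the same helper follow that locale's classification rules, and the same template works for wchar_t.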
There isn't a classification category for line-breaking characters; for that, you'll need to use ICU's u_getIntPropertyValue with the UCHAR_LINE_BREAK property and check for U_LB_MANDATORY_BREAK, etc.