How could one parse an XML-like string and convert it to a separated list?
I am trying to convert the following string:
<Categories>
<Category Assigned="0">
6 Level
<Category Assigned="1">
6.2 Level
<Category Assigned="0">
6.3 Level
<Category Assigned="0">
6.4 Level
<Category Assigned="1">
6.5 Level
</Category>
</Category>
</Category>
</Category>
</Category>
</Categories>
To a separated list like:
6 Level/6.2 Level/6.3 Level/6.4 Level/6.5 Level, 6 Level/6.2 Level
Robin Mills of exiv2 provided a Perl script: http://dev.exiv2.org/boards/3/topics/1912?r=1923#message-1923
That script would also need to parse Assigned="1". How can this be done in C++ for use in digikam, inside dmetadata.cpp, with a structure like:
QStringList ntp = tagsPath.replaceInStrings("<Category Assigned=\"0\">", "/");
I don't have enough programming background to figure this out, and I haven't found any code samples online that do something similar. I'd also like to include the code in exiv2 itself, so that other applications can benefit.
Working code will be included in digikam: https://bugs.kde.org/show_bug.cgi?id=345220
The code you have linked makes use of Perl's XML::Parser::Expat module, which is a glue layer on top of James Clark's Expat XML parser. If you want to follow the same route, you should write C++ that uses the same library, but it can be clumsy to use because its API works through callbacks that you register to be called when certain events occur in the incoming XML stream. You can see them in the Perl code, marked with comments such as "process an start-of-element event".
Once you have linked to the library, it should be simple to write C code that is equivalent to the Perl in the callbacks, as they are only a single line each. Please open a new question if you are having problems understanding the Perl.
Note also that Expat is a non-validating parser: it checks that the data is well-formed XML, but it will let through data that does not conform to a DTD or schema without comment.
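To make the event-driven idea concrete, here is a minimal sketch in standard C++ only. It does not use Expat: a small hand-written tag scanner stands in for the parser, and the three commented branches play the role of the character-data, start-of-element and end-of-element callbacks. The names OpenCategory, trimText and assignedTagPaths are hypothetical, and the sketch assumes input shaped like the sample in the question (no comments, CDATA, entities or self-closing tags).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One currently open <Category> element.
struct OpenCategory {
    std::string name;  // trimmed text that follows the start tag
    bool assigned;     // true when the start tag carries Assigned="1"
};

// Strip leading and trailing whitespace from a piece of character data.
static std::string trimText(const std::string& s) {
    const char* const ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return std::string();
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper (not part of digikam or exiv2): returns one
// "a/b/c" path for every Category element whose Assigned attribute is "1".
std::vector<std::string> assignedTagPaths(const std::string& xml) {
    std::vector<OpenCategory> open;  // stack of open Category elements
    std::vector<std::string> paths;
    std::size_t pos = 0;

    while (pos < xml.size()) {
        if (xml[pos] != '<') {
            // Character-data event: the first non-blank text inside a
            // Category is taken as its name.
            std::size_t next = xml.find('<', pos);
            if (next == std::string::npos) next = xml.size();
            const std::string text = trimText(xml.substr(pos, next - pos));
            if (!text.empty() && !open.empty() && open.back().name.empty())
                open.back().name = text;
            pos = next;
            continue;
        }

        const std::size_t close = xml.find('>', pos);
        if (close == std::string::npos) break;  // unterminated tag: stop
        const std::string tag = xml.substr(pos + 1, close - pos - 1);
        pos = close + 1;

        if (tag == "Category" || tag.compare(0, 9, "Category ") == 0) {
            // Start-of-element event: remember the Assigned flag.
            OpenCategory c;
            c.assigned = tag.find("Assigned=\"1\"") != std::string::npos;
            open.push_back(c);
        } else if (tag == "/Category" && !open.empty()) {
            // End-of-element event: emit the full path if this one is assigned.
            if (open.back().assigned) {
                std::string path;
                for (std::size_t i = 0; i < open.size(); ++i) {
                    if (i != 0) path += '/';
                    path += open[i].name;
                }
                paths.push_back(path);
            }
            open.pop_back();
        }
        // Any other tag, such as <Categories>, is ignored.
    }
    return paths;
}
```

Paths are emitted when an element closes, so for the sample in the question the deepest assigned category comes out first, followed by 6 Level/6.2 Level. Joining the returned strings with ", " gives the list asked for. With real Expat the three branches would become handler functions registered on the parser, and the stack would live in the user data.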
Given that the biggest task is to parse the XML data in the first place, you may prefer a different solution that allows you to build an in-memory document structure from the XML data, and interrogate it using the Document Object Model (DOM). The libxml library allows you to do that, and has its own Perl glue layer in the XML::LibXML module.
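As a rough illustration of the tree-based route, the sketch below builds a small in-memory tree first and only then walks it to collect the assigned paths. It is written in standard C++ and does not use libxml: ElementNode is a hypothetical stand-in for a DOM node, and parseChildren, collectAssigned and assignedPathsFromTree are made-up names. It assumes the same simple input shape as the question's sample, so comments, CDATA, entities and self-closing tags are not handled.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Minimal stand-in for a DOM element node.
struct ElementNode {
    std::string text;       // first non-blank text directly inside the element
    bool assigned = false;  // true when the start tag carries Assigned="1"
    std::vector<std::unique_ptr<ElementNode>> children;
};

// Strip leading and trailing whitespace.
static std::string stripBlanks(const std::string& s) {
    const char* const ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return std::string();
    return s.substr(first, s.find_last_not_of(ws) - first + 1);
}

// Fill `node` with everything up to its matching end tag (or end of input).
static void parseChildren(const std::string& xml, std::size_t& pos,
                          ElementNode& node) {
    while (pos < xml.size()) {
        if (xml[pos] != '<') {
            std::size_t next = xml.find('<', pos);
            if (next == std::string::npos) next = xml.size();
            if (node.text.empty())
                node.text = stripBlanks(xml.substr(pos, next - pos));
            pos = next;
            continue;
        }
        const std::size_t close = xml.find('>', pos);
        if (close == std::string::npos) { pos = xml.size(); return; }
        const std::string tag = xml.substr(pos + 1, close - pos - 1);
        pos = close + 1;
        if (!tag.empty() && tag[0] == '/') return;  // end tag of this element

        // Start tag: create a child node and parse its content recursively.
        std::unique_ptr<ElementNode> child(new ElementNode);
        child->assigned = tag.find("Assigned=\"1\"") != std::string::npos;
        parseChildren(xml, pos, *child);
        node.children.push_back(std::move(child));
    }
}

// Post-order walk: children are visited before the node itself, so deeper
// assigned categories are reported before their assigned ancestors.
static void collectAssigned(const ElementNode& node, const std::string& prefix,
                            std::vector<std::string>& out) {
    std::string path = prefix;
    if (!node.text.empty()) {
        if (!path.empty()) path += '/';
        path += node.text;
    }
    for (std::size_t i = 0; i < node.children.size(); ++i)
        collectAssigned(*node.children[i], path, out);
    if (node.assigned) out.push_back(path);
}

// Hypothetical entry point: parse into a tree, then interrogate the tree.
std::vector<std::string> assignedPathsFromTree(const std::string& xml) {
    ElementNode document;  // synthetic root that holds the top-level element
    std::size_t pos = 0;
    parseChildren(xml, pos, document);
    std::vector<std::string> out;
    collectAssigned(document, "", out);
    return out;
}
```

The point of the split is that parsing and querying are separate steps: once the tree exists, changing what is reported (for example, all paths rather than only assigned ones) means touching only the walk. With a real DOM library the hand-written parseChildren step would be replaced by the library's own document loader.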