Delete and replace strings matching a pattern in an XML file


I have an XML file that I need to modify for a personal hack project, and I would like help combining the steps below into fewer operations.

All of the commands below work on my Mac, but I would like to combine them into a one-line command. For example, is there an OR operation in sed syntax for similar operations? Something like sed -i '' '/classes|lines>|package|source|<source/d' coverage.xml

Secondly, an answer below suggests using xmlstarlet: xmlstarlet edit -d '//package' coverage.xml. That approach deletes the entire block, whereas I only want to delete the line, not the block, because I am reformatting the file. Note how I rename class to file and then make that element the parent node.

sed -i '' '/classes/d' coverage.xml
sed -i '' '/lines>/d' coverage.xml
sed -i '' '/<method/d' coverage.xml
sed -i '' '/package/d' coverage.xml
sed -i '' '/source/d' coverage.xml
sed -i '' '/<source/d' coverage.xml
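
The delete commands above can be folded into one pattern using regex alternation. The following is a minimal Python sketch of that idea rather than a sed one-liner; the names DROP and drop_matching and the sample lines are made up for illustration. In sed itself, `|` is normally only treated as alternation when extended regular expressions are enabled (the -E flag).

```python
import re

# One alternation covers every pattern from the separate sed commands.
# "source" already matches "<source", so the latter needs no entry of its own.
DROP = re.compile(r"classes|lines>|<method|package|source")

def drop_matching(lines):
    """Return only the lines that match none of the DROP alternatives."""
    return [line for line in lines if not DROP.search(line)]

# Hypothetical sample mirroring the nesting of coverage.xml.
sample = [
    "<packages>\n",
    "    <classes>\n",
    '        <class filename="a.go">\n',
    "            <methods/>\n",
    "            <lines>\n",
    '                <line hits="0" number="1"/>\n',
    "            </lines>\n",
    "        </class>\n",
    "    </classes>\n",
    "</packages>\n",
]
kept = drop_matching(sample)
```

Because the match is purely textual, a line whose filename attribute happens to contain "package" or "source" would also be dropped, so the patterns may need tightening for other inputs.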

Replace:

sed -i '' 's/class/file/g' coverage.xml

Replace only on the line containing <coverage:

sed -i '' '/<coverage/s/version="2.0.3"/version="1"/g' coverage.xml

Sample output to be converted is shown below. It is sent to a SonarQube server, which does not accept this format, so I am manually modifying it into a form SonarQube accepts, and that works fine. This question is about how to achieve that conversion.

<?xml version="1.0" ?>
<!DOCTYPE coverage
SYSTEM 'http://cobertura.sourceforge.net/xml/coverage-04.dtd'>
<coverage branch-rate="0.0" branches-covered="0" branches-valid="0" complexity="0" line-rate="0.5936254980079682" lines-covered="447" lines-valid="753" timestamp="1672197709" version="2.0.3">
    <packages>
        <package line-rate="0.7614678899082569" branch-rate="0.0" name="src.services.secrets.keys" complexity="0">
            <classes>
                <class branch-rate="0.0" complexity="0" filename="src/services/guava/keys/keys.go" line-rate="0.7614678899082569" name="src.services.guava.keys.keys.go">
                    <methods/>
                    <lines>
                        <line branch="false" hits="0" number="109"/>
                        <line branch="false" hits="0" number="123"/>
                    </lines>
                </class>
            </classes>
        </package>
        <package line-rate="0.5944055944055944" branch-rate="0.0" name="src.services.guava.vault" complexity="0">
            <classes>
                <class branch-rate="0.0" complexity="0" filename="src/services/secrets/vault/vault.go" line-rate="0.5944055944055944" name="src.services.guava.vault.vault.go">
                    <methods/>
                    <lines>
                        <line branch="false" hits="1" number="251"/>
                        <line branch="false" hits="1" number="253"/>
                    </lines>
                </class>
            </classes>
        </package>
    </packages>
</coverage>

Expected output, which works with SonarQube:

<?xml version="1.0" ?>
<!DOCTYPE coverage
  SYSTEM 'http://cobertura.sourceforge.net/xml/coverage-04.dtd'>
<coverage branch-rate="0.0" branches-covered="true" branches-valid="0" complexity="0" lineToCover-rate="0.5936254980079682" lineToCovers-covered="447" lineToCovers-valid="753" timestamp="1672173715" version="1">
        <file branch-rate="0.0" complexity="0" path="src/services/guava/keys/keys.go" lineToCover-rate="0.7614678899082569" name="src.services.guava.keys.keys.go">
                <lineToCover branch="false" covered="false" lineNumber="16"/>
                <lineToCover branch="false" covered="false" lineNumber="17"/>
        </file>
        <file branch-rate="0.0" complexity="0" path="src/services/guava/server/server.go" lineToCover-rate="0.744" name="src.services.guava.server.server.go">
                <lineToCover branch="false" covered="false" lineNumber="153"/>
        </file>
        <file branch-rate="0.0" complexity="0" path="src/services/secrets/tools/keyrotate/cmd_get.go" lineToCover-rate="0.0" name="src.services.guava.tools.keyrotate.cmd_get.go">
                <lineToCover branch="false" covered="true" lineNumber="16"/>
        </file>
        <file branch-rate="0.0" complexity="0" path="src/services/secrets/tools/keyrotate/cmd_rotate.go" lineToCover-rate="0.72" name="src.services.guava.tools.keyrotate.cmd_rotate.go">
                <lineToCover branch="false" covered="false" lineNumber="75"/>
        </file>
        <file branch-rate="0.0" complexity="0" path="src/services/secrets/tools/keyrotate/main.go" lineToCover-rate="0.0" name="src.services.guava.tools.keyrotate.main.go">
                <lineToCover branch="false" covered="true" lineNumber="80"/>
                <lineToCover branch="false" covered="true" lineNumber="81"/>
        </file>
        <file branch-rate="0.0" complexity="0" path="src/services/guava/vault/vault.go" lineToCover-rate="0.5944055944055944" name="src.services.guava.vault.vault.go">
                <lineToCover branch="false" covered="false" lineNumber="251"/>
                <lineToCover branch="false" covered="false" lineNumber="253"/>
        </file>
</coverage>

There are 2 best solutions below

Muhammad Ali On

I hope this works for you:

import re

# Read the whole file, then transform it line by line.
with open('./stackoverflow_xml_replace.txt', 'r') as f:
    lines = f.readlines()

output = []
for line in lines:
    # Drop <source> lines. Building a new list avoids removing items from
    # the list being iterated over, which would skip the following element.
    if '<source' in line:
        continue
    if '<coverage' in line:
        line = line.replace('version="2.0.3"', 'version="1"')
    # \b limits each rename to the whole word (e.g. "classes" is untouched).
    line = re.sub(r"\bclass\b", "file", line)
    line = re.sub(r"\bfilename\b", "path", line)
    # Rename only <line ...> elements; \b leaves the <lines> wrappers
    # untouched so their opening and closing tags still match.
    line = re.sub(r"<line\b", "<lineToCover", line)
    line = re.sub(r"\bnumber\b", "lineNumber", line)
    output.append(line)

with open('./stackoverflow_xml_replace_output.txt', 'w') as f:
    f.writelines(output)
print(output)
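
The expected output in the question shows covered= where the input has hits=, which the script above does not handle. A minimal sketch of that one step follows, assuming that any hit count above zero means the line is covered; the function name hits_to_covered is made up for illustration.

```python
import re

def hits_to_covered(line):
    """Rewrite hits="N" as covered="true" when N > 0, else covered="false"."""
    def repl(match):
        # Assumption: a line counts as covered once it has at least one hit.
        return 'covered="true"' if int(match.group(1)) > 0 else 'covered="false"'
    return re.sub(r'\bhits="(\d+)"', repl, line)

converted = hits_to_covered('<line branch="false" hits="0" number="109"/>')
```

It could be called on each line inside the loop, alongside the other substitutions.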
Gilles Quénot On

To edit an XML file, use a dedicated tool such as xmlstarlet, which lets you edit the XML directly. Please forget about using sed and regexes to parse XML; it is better to use a real XML parser.

To move nodes:

xmlstarlet ed --move '/path/to/node//childs' '/another/path' file.xml

To delete a node:

xmlstarlet edit -d '//node' file.xml

To edit a node's @version attribute:

xmlstarlet ed -Lu '/coverage/@version' -v '1' file.xml

This updates the version in line with good practice.

The -L option is the same as sed -i '': it edits the file in place.

To install on Mac:

brew install xmlstarlet
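
The same parser-based idea can also be sketched with Python's standard xml.etree.ElementTree module, which may help when the goal is to keep the child elements while dropping their wrappers. This is only an illustration under stated assumptions: the function name convert and the sample document are made up, only the path, lineNumber and covered attributes are emitted, and any hit count above zero is treated as covered.

```python
import xml.etree.ElementTree as ET

def convert(cobertura_xml):
    """Flatten Cobertura-style XML into <coverage>/<file>/<lineToCover>."""
    src = ET.fromstring(cobertura_xml)
    out = ET.Element("coverage", {"version": "1"})
    # iter("class") finds every <class> regardless of package nesting.
    for cls in src.iter("class"):
        file_el = ET.SubElement(out, "file", {"path": cls.get("filename", "")})
        for line in cls.iter("line"):
            # Assumption: any hit count above zero counts as covered.
            covered = "true" if int(line.get("hits", "0")) > 0 else "false"
            ET.SubElement(
                file_el,
                "lineToCover",
                {"lineNumber": line.get("number", ""), "covered": covered},
            )
    return out

# Hypothetical, trimmed-down input in the shape shown in the question.
sample = """<coverage version="2.0.3">
  <packages>
    <package name="p">
      <classes>
        <class filename="src/a.go" name="a">
          <methods/>
          <lines>
            <line branch="false" hits="0" number="109"/>
            <line branch="false" hits="1" number="251"/>
          </lines>
        </class>
      </classes>
    </package>
  </packages>
</coverage>"""

result = convert(sample)
print(ET.tostring(result, encoding="unicode"))
```

Building a new tree instead of deleting nodes in place sidesteps the problem of a delete removing an entire block: the wrapper elements are simply never copied.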