Python xmltodict

535 Views Asked by At

I'm trying to read an XML file to do version upgrades on software. Specifically, the XML file contains a set of incremental upgrades, which may have required database updates as SQL queries. The problem is that I'm not familiar enough with reading XML files to find out how to make this work properly. Below is a sample of the XML file and the python (2.7) code I'm using to try and read it.

Essentially the python code would check the @release version and if it is greater than the existing software/database version, run the SQL code. (I can easily add in the code to execute the query, once I know how to get the SQL string into a variable.)

Can anyone give me some pointers where I may be going wrong here?

<?xml version="1.0"?>
<UNITNAME>
    <!--Version Release-->
    <databaseversion>1.0</databaseversion>
    <softwareversion>2.0</softwareversion>
    <released>2018-11-07</released>
    <!--Database Upgrade Queries-->
    <Database>
        <version release="1.0">
            <sql>UPDATE `system_defaults` SET `default_value`=1.0 WHERE `default_name`='DATABASEVERSION';</sql>
            <sql>UPDATE `system_defaults` SET `default_value`=1.0 WHERE `default_name`='SOFTWAREVERSION';</sql>
        </version>
        <version release="2.0">
            <sql>UPDATE `system_defaults` SET `default_value`=2.0 WHERE `default_name`='DATABASEVERSION';</sql>
            <sql>UPDATE `system_defaults` SET `default_value`=2.0 WHERE `default_name`='SOFTWAREVERSION';</sql>
        </version>
    </Database>
</UNITNAME>

And the python code: The variable 'f' of course is the urlopen file location on the internet (and it loads fine).

            data=f.read()
            f.close()
            data=xmltodict.parse(data)
            if isinstance(data, dict):
                for k0, v0 in data["EM83D"]["Database"].iteritems():
                    #print k0, v0
                    if isinstance(v0, (list, dict)):
                        #print v0
                        for k1, v1 in v0.iteritems():
                            #print k1, v1
                            if isinstance(v1, (list, dict)):
                                print v1
                                for k2, v2 in v1.iteritems():
                                    print k2, v2['#text']
1

There are 1 best solutions below

2
On

You should be using BeautifulSoup for this task, here is an example of how to get the 'release' attribute from the xml code;

from bs4 import BeautifulSoup as bs

xml = '''<?xml version="1.0"?>
<UNITNAME>
    <!--Version Release-->
    <databaseversion>1.0</databaseversion>
    <softwareversion>2.0</softwareversion>
    <released>2018-11-07</released>
    <!--Database Upgrade Queries-->
    <Database>
        <version release="1.0">
            <sql>UPDATE `system_defaults` SET `default_value`=1.0 WHERE `default_name`='DATABASEVERSION';</sql>
            <sql>UPDATE `system_defaults` SET `default_value`=1.0 WHERE `default_name`='SOFTWAREVERSION';</sql>
        </version>
        <version release="2.0">
            <sql>UPDATE `system_defaults` SET `default_value`=2.0 WHERE `default_name`='DATABASEVERSION';</sql>
            <sql>UPDATE `system_defaults` SET `default_value`=2.0 WHERE `default_name`='SOFTWAREVERSION';</sql>
        </version>
    </Database>
</UNITNAME>'''

soup = bs(xml, 'html.parser') # or replace 'html.parser' with 'lxml' if installed (its much faster)

database_version = soup.find('databaseversion').text
software_version = soup.find('softwareversion').text

if float(software_version) > float(database_version):
    sql_statements = soup.find('version', attrs={'release': float(software_version)})
    for query in sql_statements.find_all('sql'):
        print(query.text)
else:
    print('The database does not need to be updated.')

So if software version is greater than database version this code will print;

UPDATE `system_defaults` SET `default_value`=2.0 WHERE `default_name`='DATABASEVERSION';
UPDATE `system_defaults` SET `default_value`=2.0 WHERE `default_name`='SOFTWAREVERSION';