I'm trying to convert an old HTML site to a new CMS. To get the correct menu hierarchy (with varying depth), I want to read all the files with PHP and extract/parse the menu (nested unordered lists) into an associative array.
root.html
<ul id="menu">
<li class="active">Start</li>
<ul>
<li><a href="file1.html">Sub1</a></li>
<li><a href="file2.html">Sub2</a></li>
</ul>
</ul>
file1.html
<ul id="menu">
<li><a href="root.html">Start</a></li>
<ul>
<li class="active">Sub1</li>
<ul>
<li><a href="file3.html">SubSub1</a></li>
<li><a href="file4.html">SubSub2</a></li>
<li><a href="file5.html">SubSub3</a></li>
<li><a href="file6.html">SubSub4</a></li>
</ul>
</ul>
</ul>
file3.html
<ul id="menu">
<li><a href="root.html">Start</a></li>
<ul>
<li><a href="file1.html">Sub1</a></li>
<ul>
<li class="active">SubSub1</li>
<ul>
<li><a href="file7.html">SubSubSub1</a></li>
<li><a href="file8.html">SubSubSub2</a></li>
<li><a href="file9.html">SubSubSub3</a></li>
</ul>
</ul>
</ul>
</ul>
file4.html
<ul id="menu">
<li><a href="root.html">Start</a></li>
<ul>
<li><a href="file1.html">Sub1</a></li>
<ul>
<li><a href="file3.html">SubSub1</a></li>
<li class="active">SubSub2</li>
<li><a href="file5.html">SubSub3</a></li>
<li><a href="file6.html">SubSub4</a></li>
</ul>
</ul>
</ul>
I would like to loop through all the files, extract the list with id="menu", and create an array like this (or similar) while keeping the hierarchy and file information:
Array
    [file] => root.html
    [child] => Array
        [Sub1] => Array
            [file] => file1.html
            [child] => Array
                [SubSub1] => Array
                    [file] => file3.html
                    [child] => Array
                        [SubSubSub1] => Array
                            [file] => file7.html
                        [SubSubSub2] => Array
                            [file] => file8.html
                        [SubSubSub3] => Array
                            [file] => file9.html
                [SubSub2] => Array
                    [file] => file4.html
                [SubSub3] => Array
                    [file] => file5.html
                [SubSub4] => Array
                    [file] => file6.html
        [Sub2] => Array
            [file] => file2.html
With the help of the PHP Simple HTML DOM Parser library, I successfully read the file and extracted the menu:
$html = file_get_html($file);
foreach ($html->find("ul[id=menu]") as $ul) {
..
}
To parse only the active section of the menu (leaving out the links that go one or more levels up), I used
$ul->find("ul",-1)
which finds the last ul inside the outer ul. This works great for a single file.
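For comparison, the same "last nested ul" selection can be sketched with PHP's built-in DOM extension instead of Simple HTML DOM. This is only an illustration under assumptions: the fragment below is an abridged, hypothetical stand-in for file1.html, and it is loaded with loadXML() because the fragment happens to be well-formed (complete HTML pages would need a different loader).

```php
<?php
// Hypothetical, abridged fragment mirroring file1.html from above.
$markup = '<ul id="menu"><li><a href="root.html">Start</a></li><ul>'
        . '<li class="active">Sub1</li><ul>'
        . '<li><a href="file3.html">SubSub1</a></li>'
        . '<li><a href="file4.html">SubSub2</a></li>'
        . '</ul></ul></ul>';

$doc = new DOMDocument();
$doc->loadXML($markup); // the fragment is well-formed, so the XML loader suffices here

$xpath = new DOMXPath($doc);
// The parentheses make last() apply to the whole result set, in document order.
$lastUl = $xpath->query('(//ul[@id="menu"]//ul)[last()]')->item(0);

// Collect the item names of the innermost (active) section.
$names = [];
foreach ($lastUl->getElementsByTagName('li') as $li) {
    $names[] = trim($li->textContent);
}
print_r($names);
```

The XPath expression plays the role of find("ul", -1): it picks the last ul below the menu root, which is the list holding the children of the active item.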
But I'm having trouble looping through all the files/menus and keeping the parent/child information, because each menu has a different depth.
Thanks for all suggestions, tips and help!
Edit: OK, this was not so easy after all :)
By the way, this library is really an excellent tool. Kudos to the guys who wrote it.
Here is one possible solution:
Usage:
Output of your test case:
Notes:
You could modify the code to have the (sub)menus as an array with numeric indexes and names as properties (so that two items with the same name would not overwrite each other), but that would complicate the structure of the result.
Should such name duplication occur, the best solution would be to rename one of the items, IMHO.
The code could be modified to handle more than one menu per file, but that does not make much sense IMHO (it would mean a duplicated root menu ID, which would likely cause trouble for any JavaScript trying to process it in the first place).
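To illustrate how menus of different depths might be folded into one tree, here is a minimal sketch under stated assumptions, not a definitive implementation. It uses PHP's built-in DOMDocument rather than Simple HTML DOM so that it runs without extra files; parseList() and menuFromString() are hypothetical helper names, and the inline strings are abridged stand-ins for the real files. It relies on two observations from the sample markup: a ul that directly follows an li holds that item's children, and the one item without a link is the file currently being read.

```php
<?php
// Sketch only: parseList() and menuFromString() are hypothetical helpers.

/** Turn one <ul> into [name => ['file' => ..., 'child' => [...]]]. */
function parseList(DOMElement $ul, string $currentFile): array
{
    $items = [];
    $last = null; // name of the most recently seen <li>
    foreach ($ul->childNodes as $node) {
        if (!$node instanceof DOMElement) {
            continue; // skip whitespace text nodes
        }
        if ($node->nodeName === 'li') {
            $last = trim($node->textContent);
            $link = $node->getElementsByTagName('a')->item(0);
            // An item without a link is the active page, i.e. the file being read.
            $items[$last] = [
                'file' => $link !== null ? $link->getAttribute('href') : $currentFile,
            ];
        } elseif ($node->nodeName === 'ul' && $last !== null) {
            // A <ul> directly after an <li> holds that item's children.
            $items[$last]['child'] = parseList($node, $currentFile);
        }
    }
    return $items;
}

/** Parse a well-formed menu fragment; full HTML pages would need loadHTML() instead. */
function menuFromString(string $markup, string $currentFile): array
{
    $doc = new DOMDocument();
    $doc->loadXML($markup);
    $menu = (new DOMXPath($doc))->query('//ul[@id="menu"]')->item(0);
    return $menu instanceof DOMElement ? parseList($menu, $currentFile) : [];
}

// Abridged inline stand-ins for root.html, file1.html and file3.html.
$files = [
    'root.html' => '<ul id="menu"><li class="active">Start</li><ul>'
        . '<li><a href="file1.html">Sub1</a></li>'
        . '<li><a href="file2.html">Sub2</a></li></ul></ul>',
    'file1.html' => '<ul id="menu"><li><a href="root.html">Start</a></li><ul>'
        . '<li class="active">Sub1</li><ul>'
        . '<li><a href="file3.html">SubSub1</a></li>'
        . '<li><a href="file4.html">SubSub2</a></li></ul></ul></ul>',
    'file3.html' => '<ul id="menu"><li><a href="root.html">Start</a></li><ul>'
        . '<li><a href="file1.html">Sub1</a></li><ul>'
        . '<li class="active">SubSub1</li><ul>'
        . '<li><a href="file7.html">SubSubSub1</a></li></ul></ul></ul></ul>',
];

// Each file yields a partial tree; keys are item names, so the trees merge by name.
$tree = [];
foreach ($files as $name => $markup) {
    $tree = array_replace_recursive($tree, menuFromString($markup, $name));
}
print_r($tree);
```

Because every partial tree starts at the same root item and is keyed by name, array_replace_recursive() is enough to combine them regardless of depth. The same keying is also why two sibling items with identical names would overwrite each other, as mentioned in the notes.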