The 1&1 hosting service is withdrawing from Poland, my country, and has told every client to move out. Because there is no way to export the website, I need to parse it manually and retrieve the data I want.
Basically, the task is to export all articles together with their image attachments.
I'm trying to manipulate the HTML from this site: http://www.naszeiganie.org/lata-2014-2015/ so that each post ends up in its own div element, which would let me parse the whole document properly and retrieve the mixed data each article contains.
I figured out that every article starts with:
<div class="n module-type-header diyfeLiveArea ">
<h2>
<span class="diyfeDecoration">
and there is no repeatable marker for the end of an article. Instead, the next occurrence of the code above tells me that the current post ends and a new one begins.
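So the flat document would need to become something like this (simplified, hypothetical markup; titles and content are placeholders):

<div class="DIV-WRAP">
    <div class="n module-type-header diyfeLiveArea ">
        <h2><span class="diyfeDecoration">First article title</span></h2>
    </div>
    ...first article's content: paragraphs, images...
</div>
<div class="DIV-WRAP">
    <div class="n module-type-header diyfeLiveArea ">
        <h2><span class="diyfeDecoration">Second article title</span></h2>
    </div>
    ...second article's content...
</div>

Here is my attempt so far: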
function smi_parse_web() {
    $url = 'http://www.naszeiganie.org/lata-2014-2015/';
    $content = file_get_contents($url);

    $doc = new DOMDocument();
    libxml_use_internal_errors(true);
    $doc->loadHTML($content);
    libxml_clear_errors();

    $finder = new DOMXPath($doc);
    // Select the <h2> inside every article header div.
    $nodes = $finder->query('//div[contains(@class,"module-type-header")]/h2');

    foreach ($nodes as $anchor) {
        if ($anchor->nodeName == 'h2') {
            // Build a wrapper and try to move the <h2> into it.
            $element = $doc->createElement('div', 'x');
            $element->setAttribute('class', 'DIV-WRAP');
            $element->insertBefore($anchor);
        }
    }

    echo $doc->saveHTML();
}
I came up with the code above, but it has no effect; worse, the found $anchor has its content cleared out.
My target is to find all the HTML content between one div > h2 combination and the next, and wrap it in the DIV-WRAP div.
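I suspect the core mistake is that DOMNode::insertBefore() has to be called on a node that is already in the document (the future parent), while my code calls it on the detached wrapper, which just pulls the <h2> out of the tree. Here is a minimal sketch of the direction I'm considering instead, assuming the header divs and the article content are direct siblings (untested):

$finder = new DOMXPath($doc);
$headers = $finder->query('//div[contains(@class,"module-type-header")]');

foreach ($headers as $header) {
    // Attach the wrapper to the document first, right before the header.
    $wrap = $doc->createElement('div');
    $wrap->setAttribute('class', 'DIV-WRAP');
    $header->parentNode->insertBefore($wrap, $header);

    // Move the header and every following sibling into the wrapper,
    // stopping when the next article header begins.
    $node = $header;
    while ($node !== null) {
        $next = $node->nextSibling;
        $wrap->appendChild($node); // appendChild() moves an already-attached node
        if ($next instanceof DOMElement
            && strpos($next->getAttribute('class'), 'module-type-header') !== false) {
            break;
        }
        $node = $next;
    }
}

Does that look like a sane direction, or is there a simpler way?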
What would you suggest I do to move the project forward? Maybe I've gone wrong somewhere and the simplest way is right at hand?
Thanks a lot!
(I know how to deal with the images themselves, but I want them to be associated with each downloaded article.)
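(For what it's worth, once the wrappers exist, I assume each article's images could be collected with an XPath query made relative to its wrapper, something like:

$articles = $finder->query('//div[@class="DIV-WRAP"]');
foreach ($articles as $article) {
    $title = $finder->evaluate('string(.//h2)', $article);
    // The second argument restricts the query to this article's subtree.
    foreach ($finder->query('.//img', $article) as $img) {
        echo $title, ' => ', $img->getAttribute('src'), "\n";
    }
}

but that part I can handle once the wrapping works.)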