Is there any way to make HTML Purifier strip elements that have a certain attribute?
I'm using HTML Purifier to clean up a full webpage into just its basic content so I can index and search it.
I want to be able to add an attribute like data-no-index
to certain wrapper elements so that they are ignored.
This is my HTML Purifier setup:
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Allowed', 'h1,h2,h3,h4,h5,h6,p,a[href],ul,ol,li,img[src]');
$purifier = new HTMLPurifier($config);
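
For illustration only, the behaviour I'm after could be approximated by a pre-processing step that runs before the purifier. The sketch below is not an HTML Purifier feature: it assumes PHP's DOM extension is available, and `stripElementsWithAttribute` is a hypothetical helper name made up for this example. It also assumes the page declares its own character encoding, since `DOMDocument::loadHTML` otherwise falls back to a default.

```php
<?php
// Hypothetical helper (not part of HTML Purifier): removes every element
// that carries the given attribute, together with all of its children.
function stripElementsWithAttribute(string $html, string $attribute = 'data-no-index'): string
{
    // Real-world pages are rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Take a snapshot of the matches before modifying the tree.
    $matches = iterator_to_array($xpath->query('//*[@' . $attribute . ']'));
    foreach ($matches as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    // Serialise the whole document; the purifier discards the outer tags.
    return $doc->saveHTML();
}
```

With the setup above, the filtered markup would then be passed on as `$purifier->purify(stripElementsWithAttribute($html))`. Ideally I'd like to get the same result from within HTML Purifier itself.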