I'm trying various options for creating a BHO that traverses an HTML page's DOM. One implementation uses C# and is registered in the registry with ThreadingModel set to Both. It goes like this:
1. retrieve IWebBrowser2.Document,
2. obtain the IDocumentSelector interface from the document object,
3. invoke IDocumentSelector.querySelectorAll("*"), which yields an IHTMLDOMChildrenCollection reference,
4. get IHTMLDOMChildrenCollection.length,
5. run the for-loop over the 0..length range (for(int index = 0; index < totalCount; index++)),
6. inside each loop iteration obtain the collection item using IHTMLDOMChildrenCollection.item(),
7. cast the collection item reference to IHTMLElement2,
8. call IHTMLElement2.getBoundingClientRect()
and it works reasonably well: a page with about 1500 elements is traversed in 200-300 milliseconds (the loop duration is measured by reading DateTime.UtcNow before and after the loop and taking TotalMilliseconds of the difference between the two readings).
Another implementation is done with Visual C++ and ATL. It does mostly the same as the C# version: CComQIPtr is used in place of casts, and the loop is the same. It's also registered with ThreadingModel set to Both.
The C++ implementation traverses the very same page DOM in 40-60 milliseconds. Time is measured by reading GetTickCount() before and after the loop and getting the difference.
Then I removed step 8 from the loop body: the item is still obtained and IHTMLElement2 is still obtained from it, but getBoundingClientRect() is not invoked. After this change both implementations run in about the same time, 40-50 milliseconds.
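For reference, here is a minimal sketch of how the two variants (with and without the per-element call) can be timed on the C++ side with a monotonic clock. It is not the actual BHO code: measure_ms, traverse and fake_rect_query are hypothetical names, the per-element work is a stand-in for the real COM calls, and std::chrono::steady_clock is used instead of GetTickCount(), whose resolution is typically 10-16 ms and therefore coarse relative to a 40-60 ms reading.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: runs `body` once and returns the elapsed time in
// milliseconds, measured with a monotonic clock.
template <typename F>
double measure_ms(F&& body) {
    const auto start = std::chrono::steady_clock::now();
    body();
    const auto stop = std::chrono::steady_clock::now();
    const double elapsed =
        std::chrono::duration<double, std::milli>(stop - start).count();
    assert(elapsed >= 0.0);  // steady_clock never goes backwards
    return elapsed;
}

// Stand-in for the per-element call (the real code would invoke the
// bounding-rect method on the element here); purely illustrative.
inline long fake_rect_query(std::size_t index) {
    return static_cast<long>(index % 7);
}

// Walks `count` hypothetical elements. When `query_rect` is false the
// per-element call is skipped, mirroring the variant with step 8 removed.
inline long traverse(std::size_t count, bool query_rect) {
    long checksum = 0;
    for (std::size_t index = 0; index < count; ++index) {
        checksum += 1;  // stands in for item() plus the interface query
        if (query_rect) {
            checksum += fake_rect_query(index);
        }
    }
    return checksum;
}
```

Calling measure_ms([&] { traverse(1500, true); }) and then the same with false gives two readings taken by the same clock, which is how I compare the loop with and without the call.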
This looks weird. Why would only getBoundingClientRect() be affected? What's so special about it that it slows things down so much in the C# version?