Generic xpath to access particular tab content if that exists


Below are two web pages with tabs such as "Features", "Application" and "Benefits". I want to extract the content of the "Features" tab only. One page has "Features" as its first tab; the other has "Benefits" instead of a "Features" tab.

http://www.eaton.com/Eaton/ProductsServices/Hydraulics/Accumulators/PCT_256248
http://www.eaton.com/Eaton/ProductsServices/Vehicle/Superchargers/RSeries/index.htm#tabs-2

Tried method: Using the code below with the XPath //a[span='Features']/../../../div/div, I get the content of all tabs present on the page. However, I am looking for a generic XPath that returns the content of the "Features" tab only, and returns nothing if the page has no "Features" tab.

    import org.htmlcleaner.HtmlCleaner;
    import org.htmlcleaner.TagNode;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    HtmlCleaner htmlCleaner = new HtmlCleaner();
    String url = "http://www.eaton.com/Eaton/ProductsServices/Hydraulics/Accumulators/PCT_256248";
    // Fetch the page with Jsoup, then hand the HTML to HtmlCleaner for XPath evaluation.
    Document doc = Jsoup.connect(url).timeout(30000).userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.120 Safari/535.2").get();
    String pageContent = doc.toString();
    TagNode node = htmlCleaner.clean(pageContent);
    // This XPath returns the content of every tab, not only "Features".
    Object[] statsNode = node.evaluateXPath("//a[span='Features']/../../../div/div");
    for (int i = 0; i < statsNode.length; i++) {
        TagNode resultNode = (TagNode) statsNode[i];
        System.out.print(resultNode.getText());
    }

There is 1 best solution below


Notice that the target div id corresponds to the href attribute of the tab header. For example, when the href attribute value is "#tabs-1", the corresponding div id attribute value is "tabs-1".

Taking advantage of that correlation, here is one possible XPath that returns the <div> element corresponding to the Features link/tab, or nothing if there is no Features tab:

//div[concat('#', @id)=preceding::a[span='Features']/@href]
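
To see how this expression behaves, here is a minimal sketch that evaluates it with the JDK's built-in XPath 1.0 engine (javax.xml.xpath). The two markup strings are simplified, hypothetical stand-ins for the tab structure described above, not the real Eaton pages, and the class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FeaturesTabXPath {
    // The XPath from the answer, unchanged.
    static final String FEATURES_XPATH =
        "//div[concat('#', @id)=preceding::a[span='Features']/@href]";

    // Hypothetical, simplified markup: a page that has a "Features" tab.
    static final String WITH_FEATURES =
        "<div class='tabs'><ul>"
        + "<li><a href='#tabs-1'><span>Features</span></a></li>"
        + "<li><a href='#tabs-2'><span>Application</span></a></li>"
        + "</ul>"
        + "<div id='tabs-1'>Feature text</div>"
        + "<div id='tabs-2'>Application text</div></div>";

    // Hypothetical, simplified markup: "Benefits" takes the place of "Features".
    static final String WITHOUT_FEATURES =
        "<div class='tabs'><ul>"
        + "<li><a href='#tabs-1'><span>Benefits</span></a></li>"
        + "<li><a href='#tabs-2'><span>Application</span></a></li>"
        + "</ul>"
        + "<div id='tabs-1'>Benefit text</div>"
        + "<div id='tabs-2'>Application text</div></div>";

    // Parses well-formed markup and returns the text of every matching <div>,
    // or an empty string when no "Features" tab exists.
    static String featuresText(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xhtml)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(FEATURES_XPATH, doc, XPathConstants.NODESET);
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < hits.getLength(); i++) {
            text.append(hits.item(i).getTextContent().trim());
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("[" + featuresText(WITH_FEATURES) + "]");    // prints [Feature text]
        System.out.println("[" + featuresText(WITHOUT_FEATURES) + "]"); // prints []
    }
}
```

The expression relies on the preceding axis and on concat(), so it needs an engine that implements those parts of XPath 1.0. If HtmlCleaner's own evaluateXPath does not accept it, one option is to convert the cleaned TagNode to an org.w3c.dom.Document and evaluate the XPath with javax.xml.xpath as shown here.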