C# XPath not correct when the XPath contains text()


Let me keep it simple: I was trying to get the release date. When I copy the XPath, it ends with text(), and it just doesn't work. Here is what I am trying to get.

This is just part of the website code I am pasting; the whole page is 5000 lines. Here is a link: http://www.imdb.com/title/tt2561572/?ref_=nm_knf_t3. If you right-click the release date, inspect it, and then copy the XPath, that XPath does not work in C#.

<span class="ghost">|</span>
<a href="/title/tt2561572/releaseinfo?ref_=tt_ov_inf"
title="See more release dates" >7 May 2015 (Netherlands)
<meta itemprop="datePublished" content="2015-05-07" />

I know the formatting is not good, but the easiest thing is to take any movie on IMDb and look at the release date; that is what I am trying to get: the release date, 7 May 2015, and the meta content attribute. I can't figure out why it won't work. Here is my code.

This was my first try, and it did not work. It finds the node, but when I add text() it just does not work.

            // Loading and getting the document
            HtmlAgilityPack.HtmlDocument doc = base.Document;

            // Getting the node
            HtmlNode node = doc.DocumentNode.SelectSingleNode("//*[@id=\"title - overview - widget\"]/div[2]/div[2]/div/div[2]/div[2]/div/a[3]/text()");

            // Returning the text of the node
            return node.InnerText;

Then I started trying to get the content value out of the meta tag, which I also want. That did not work either.

// Getting the node
HtmlNode node = doc.DocumentNode.SelectSingleNode("//*[@id=\"title - overview - widget\"]/div[2]/div[2]/div/div[2]/div[2]/div/a[4]/meta");

string date = node.Attributes["content"].Value;

This is when I tried to get the meta line. But when you copy the XPath for 7 May 2015, it ends with text() and it just does not work. I know I am posting a lot, sorry for that.


There is 1 best solution below


Got it, no thanks to anyone for helping. The XPath should be "//div[@class='subtext']//meta[@itemprop=\"datePublished\"]"
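
For anyone landing here later, this is a minimal sketch of how that XPath might be used with HtmlAgilityPack. The HTML string is a trimmed-down stand-in for the IMDb page, and the wrapping div with class "subtext" is an assumption taken from the XPath itself, not copied from the real page. A likely reason the browser-copied path failed is that it depends on exact element positions in the DOM the browser builds, which may not match the raw HTML that HtmlAgilityPack parses; matching on attributes avoids that.

```csharp
using System;
using HtmlAgilityPack;

class ReleaseDateExample
{
    static void Main()
    {
        // Stand-in markup for illustration only; the <div class="subtext">
        // wrapper is assumed from the XPath in the answer.
        const string html = @"
<div class=""subtext"">
  <span class=""ghost"">|</span>
  <a href=""/title/tt2561572/releaseinfo?ref_=tt_ov_inf""
     title=""See more release dates"">7 May 2015 (Netherlands)
    <meta itemprop=""datePublished"" content=""2015-05-07"" />
  </a>
</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Select by class and attribute instead of by position (div[2]/a[3]/...).
        HtmlNode meta = doc.DocumentNode.SelectSingleNode(
            "//div[@class='subtext']//meta[@itemprop='datePublished']");

        // SelectSingleNode returns null when nothing matches, so guard
        // before reading the attribute.
        string date = meta?.GetAttributeValue("content", string.Empty);

        // The visible text can be read from the link itself, without text().
        HtmlNode link = doc.DocumentNode.SelectSingleNode(
            "//div[@class='subtext']//a[@title='See more release dates']");
        string text = link?.InnerText.Trim();

        Console.WriteLine(date);
        Console.WriteLine(text);
    }
}
```

The null checks matter here: in the original attempts, node.InnerText and node.Attributes["content"] throw a NullReferenceException whenever the XPath matches nothing, which can hide the real problem (the path itself).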