How can I get this text from h4?

(Sorry about my English, I'm Brazilian)

I'm trying to get the InnerText of an h4 tag using HtmlAgilityPack. I managed to get this kind of value from 3 of the 4 tags I need on the website, but the last one is the most important, and it just returns an empty value.

Is it possible that the way the website was built requires a different way to get this value?

This is the specific h4 whose InnerText ("356.386.496,02") I'm trying to extract:

<h4 class="text-black--opacity-60 fs-20 fs-sm-42 fs-lg-40 w-100 mt-3">
<span class="align-middle fs-12 fs-lg-12 pr-4">R$</span>
"356.386.496,02"
</h4>

I've tried this:

HtmlDocument htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(data);

var nodes = htmlDocument.DocumentNode.SelectNodes("//h4[@class='text-black--opacity-60 fs-20 fs-sm-42 fs-lg-40 w-100 mt-3']");

foreach (var node in nodes)
{
    Console.WriteLine(node.InnerText);
}
//Result in console:
//=> 

Note that SelectNodes doesn't return null; it finds the h4 node perfectly, but the InnerText value is "".

There is 1 best solution below

Alaaeddine HFIDHI On BEST ANSWER

Try replacing "356.386.496,02" with 356.386.496,02 (no quotes), or with ""356.386.496,02"" (doubled quotes, which is how a literal quote is written inside a C# verbatim string).
This should work:

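using System;
using HtmlAgilityPack;

public class Program
{
    public static void Main()
    {
        // Verbatim string: "" inside @"..." stands for one literal double quote.
        var html =
        @"<h4 class=""text-black--opacity-60 fs-20 fs-sm-42 fs-lg-40 w-100 mt-3"">
<span class=""align-middle fs-12 fs-lg-12 pr-4"">R$</span>
""356.386.496,02""
</h4>";

        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html);

        var htmlNodes = htmlDoc.DocumentNode.SelectNodes("//h4[@class='text-black--opacity-60 fs-20 fs-sm-42 fs-lg-40 w-100 mt-3']");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (htmlNodes == null) return;

        foreach (var node in htmlNodes)
        {
            // InnerText covers the whole h4, so it includes the span's "R$" too.
            Console.WriteLine(node.InnerText);
        }
    }
}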
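
If only the number is wanted, without the "R$" from the span, one option is to read just the h4's own text nodes. Below is a minimal sketch of that idea, not a confirmed fix for the site in the question. The class name H4TextSketch, the method OwnText and the SampleHtml constant are hypothetical names made up for illustration, and the sample markup assumes the quotes shown around the number in the question are not part of the page source. If the h4 has no text at all in the downloaded HTML, one possible explanation is that the page fills the value in with JavaScript after loading, which HtmlAgilityPack would not see; that is an assumption, since the question does not show the raw response.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class H4TextSketch
{
    // Hypothetical sample markup: the h4 from the question, without the quotes.
    public const string SampleHtml =
        @"<h4 class=""text-black--opacity-60 fs-20 fs-sm-42 fs-lg-40 w-100 mt-3"">
<span class=""align-middle fs-12 fs-lg-12 pr-4"">R$</span>
356.386.496,02
</h4>";

    // Returns the text that sits directly inside the h4 (child elements such as
    // the span are skipped), or null when no matching h4 is found.
    public static string OwnText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var h4 = doc.DocumentNode.SelectSingleNode(
            "//h4[@class='text-black--opacity-60 fs-20 fs-sm-42 fs-lg-40 w-100 mt-3']");
        if (h4 == null) return null;

        // Keep only the direct text-node children of the h4.
        var parts = h4.ChildNodes
            .Where(n => n.NodeType == HtmlNodeType.Text)
            .Select(n => n.InnerText);

        // Join the pieces and drop the surrounding line breaks.
        return string.Concat(parts).Trim();
    }

    public static void Main()
    {
        Console.WriteLine(OwnText(SampleHtml)); // prints 356.386.496,02
    }
}
```

Calling OwnText on the real page's HTML and getting back an empty string would suggest that the number is simply not present in what was downloaded, which is worth checking by printing the raw `data` string before parsing.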