Efficient method to increase "code" by 1 - HtmlAgilityPack


I'm working on an app that extracts content from a game page (example) and displays it to the user in a textbox. If the user wishes, they can save it as a .txt or .xls (Excel spreadsheet) file.

The main problem I'm facing right now is that I have to change the code manually to extract data about another in-game unit.

If you open the link, you'll see that I'm currently extracting "Weapons", "Used", "Survived" and "Casualties" from the Defender side (for now), but only one type of unit (that is, only one row of that table) is being extracted. I'm looking for a way to search "tr[1]/td[2]/span[1]" through "tr[45]/td[2]/span[1]" (even though the example page only goes up to tr[16]), or a way to keep searching until it finds no more data and then stop.

Sorry for any mistakes in the text; I'm not a native speaker.

private void btnStart_Click(object sender, RoutedEventArgs e)
    {
        HtmlDocument brPage = new HtmlWeb().Load("http://us.desert-operations.com/world2/battleReport.php?code=f8d77b1328c8ce09ec398a78505fc465");
        HtmlNodeCollection nodes = brPage.DocumentNode.SelectNodes("/html[1]/body[1]/div[1]/div[1]/div[3]/div[1]/div[1]/div[1]/div[2]/table[2]");
        string result = "";
        List<brContentSaver> ContentList = new List<brContentSaver>();
        foreach (var item in nodes)
        {
            brContentSaver cL = new brContentSaver();
            /*  Here comes the junk handler, replaces all junk for nothing, essentially deleting it
                I wish I knew a way to do this efficiently  */
            cL.Weapons = item.SelectSingleNode("tr[16]/td[1]").InnerText
                .Replace("&nbsp;*&nbsp;", " ")
                .Replace("&nbsp ; *&nbsp ;", " ");

            cL.Used = item.SelectSingleNode("tr[16]/td[2]/span[1]").InnerText
                .Replace("&nbsp;*&nbsp;", " ")
                .Replace("&nbsp ; *&nbsp ;", " ");

            cL.Survived = item.SelectSingleNode("tr[16]/td[3]").InnerText
                .Replace("&nbsp;*&nbsp;", " ")
                .Replace("&nbsp ; *&nbsp ;", " ");

            if (cL.Survived == "0")
            {
                cL.Casualties = cL.Used;
            } else
            {
                /*  int Casualties = int.Parse(cL.Casualties);
                 *  int Used = int.Parse(cL.Used);
                 *  int Survived = int.Parse(cL.Survived);

                 *  Casualties = Used - Survived;   */

                 cL.Casualties = item.SelectSingleNode("tr[16]/td[4]").InnerText
                 .Replace("&nbsp;*&nbsp;", " ")
                 .Replace("&nbsp ; *&nbsp ;", " ");
            }

            ContentList.Add(cL);
        }

        foreach (var item in ContentList)
        {
            result += item.Weapons + " " + item.Used + " " + item.Survived + " " + item.Casualties + Environment.NewLine;
        }
        brContent.Text = result;

    }

Sorry if this sounds silly, but I'm new to programming, especially in C#.

Edit 1: I noticed the "if (cL.Survived == "0")" check. I was just testing some stuff much earlier and forgot to change it, but hey, it works.

Edit 2: In case you are wondering, I'm also using this:

public class brContentSaver
{
    public string Weapons { get; set; }

    public string Used { get; set; }

    public string Survived { get; set; }

    public string Casualties { get; set; }
}

There is 1 best solution below


I don't have much time to write this up, but I hope it helps if you still need it. I find LINQ handier:

private static void Run()
{
    HtmlDocument brPage = new HtmlWeb().Load("http://us.desert-operations.com/world2/battleReport.php?code=f8d77b1328c8ce09ec398a78505fc465");
    // GetAttributeValue returns the supplied default ("") when the attribute is missing, so no null checks are needed
    var nodes = brPage.DocumentNode.Descendants("table").Where(_ => _.GetAttributeValue("class", "").Contains("battleReport"));
    string result = "";
    List<brContentSaver> ContentList = new List<brContentSaver>();
    foreach (var item in nodes)
    {
        if (item.Descendants("th").Any(_ => _.InnerText.Equals("Weapons")))
        {
            //get all tr nodes except first one (header)
            var trNodes = item.Descendants("tr").Skip(1);
            foreach (var node in trNodes)
            {
                brContentSaver cL = new brContentSaver();
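                var tds = node.Descendants("td").ToArray();
                // skip rows that do not have the four data cells (for example spacer or sub-header rows)
                if (tds.Length < 4) continue;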
                /*  Here comes the junk handler, replaces all junk for nothing, essentially deleting it
                    I wish I knew a way to do this efficiently  */
                cL.Weapons = tds[0].InnerText
                    .Replace("&nbsp;*&nbsp;", " ")
                    .Replace("&nbsp ; *&nbsp ;", " ");

                cL.Used = tds[1].Descendants("span").FirstOrDefault()?.InnerText
                    .Replace("&nbsp;*&nbsp;", " ")
                    .Replace("&nbsp ; *&nbsp ;", " ");
                if (string.IsNullOrEmpty(cL.Used))
                {
                    cL.Used = tds[1].InnerText;
                }

                cL.Survived = tds[2].Descendants("span").FirstOrDefault()?.InnerText
                    .Replace("&nbsp;*&nbsp;", " ")
                    .Replace("&nbsp ; *&nbsp ;", " ");

                if (string.IsNullOrEmpty(cL.Survived))
                {
                    cL.Casualties = cL.Used;
                }
                else
                {
                    /*  int Casualties = int.Parse(cL.Casualties);
                     *  int Used = int.Parse(cL.Used);
                     *  int Survived = int.Parse(cL.Survived);

                     *  Casualties = Used - Survived;   */

                    cL.Casualties = tds[3].Descendants("span").FirstOrDefault()?.InnerText
                    .Replace("&nbsp;*&nbsp;", " ")
                    .Replace("&nbsp ; *&nbsp ;", " ");

                    if (string.IsNullOrEmpty(cL.Casualties))
                    {
                        cL.Casualties = tds[3].InnerText;
                    }
                }

                ContentList.Add(cL);
            }
        }
    }

    foreach (var item in ContentList)
    {
        result += item.Weapons + " " + item.Used + " " + item.Survived + " " + item.Casualties + Environment.NewLine;
    }
    var text = result; // in the original click handler, assign this to brContent.Text instead

}
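
The same row-by-row idea can also be written with XPath, which may feel closer to the code in the question. What follows is only a rough sketch under a few assumptions: the sample HTML string is made up for illustration (the real page may be laid out differently), the table is assumed to carry the "battleReport" class as above, and BattleReportSketch, ReadRows and Clean are hypothetical names. The point is that SelectNodes returns every matching row, so there is no need to count from tr[1] to tr[45]; the loop simply ends when the rows run out. Clean uses HtmlEntity.DeEntitize to decode entities such as &nbsp; in one go instead of chaining Replace calls.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class BattleReportSketch
{
    // Hypothetical helper: decodes HTML entities, turns the resulting
    // non-breaking spaces into plain spaces, and trims the ends.
    static string Clean(string raw) =>
        HtmlEntity.DeEntitize(raw ?? "").Replace('\u00A0', ' ').Trim();

    // Returns one string[] per data row: Weapons, Used, Survived, Casualties.
    public static List<string[]> ReadRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var rows = new List<string[]>();

        // "tr[td]" keeps only rows that contain at least one <td>,
        // which leaves out the header row made of <th> cells.
        var trNodes = doc.DocumentNode.SelectNodes(
            "//table[contains(@class,'battleReport')]//tr[td]");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (trNodes == null) return rows;

        foreach (var tr in trNodes)
        {
            var tds = tr.SelectNodes("td");
            if (tds == null || tds.Count < 4) continue; // not a full data row

            rows.Add(new[]
            {
                Clean(tds[0].InnerText),
                Clean(tds[1].InnerText), // InnerText already includes the text of a nested <span>
                Clean(tds[2].InnerText),
                Clean(tds[3].InnerText)
            });
        }
        return rows;
    }

    public static void Main()
    {
        // Made-up sample markup, only to exercise the loop.
        const string sample = @"<table class='battleReport'>
<tr><th>Weapons</th><th>Used</th><th>Survived</th><th>Casualties</th></tr>
<tr><td>&nbsp;Tank&nbsp;</td><td><span>10</span></td><td>4</td><td>6</td></tr>
<tr><td>Jeep</td><td><span>3</span></td><td>0</td><td>3</td></tr>
</table>";

        foreach (var row in ReadRows(sample))
        {
            Console.WriteLine(string.Join(" ", row));
        }
    }
}
```

For the live page you would pass the downloaded markup (or adapt ReadRows to take the HtmlDocument returned by HtmlWeb.Load) and copy each array into a brContentSaver. If the junk on the real page is more than &nbsp;, Clean is the single place to extend.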