
Regex on parsed HTML: how do I grab a specific string?


I'm trying to specifically get the string after charactername= and before " >. How would I use regex to allow me to catch only the player name?

This is what I have so far, and it's not working: it doesn't print anything. client.DownloadString returns a string like this:

<a href="https://my.examplegame.com/charactername=Atro+Roter" >

So I know it actually gets the string; I'm just stuck on the regex.

using (var client = new WebClient())
{
    // Example of what the string looks like on Console when I Console.WriteLine(html):
    // <a href="https://my.examplegame.com/charactername=Atro+Roter" >
    // I want the "Atro+Roter"

    string html = client.DownloadString(worldDest + world + inOrderName);
    string playerName = "https://my.examplegame.com/charactername=(.+?)\" >";

    MatchCollection m1 = Regex.Matches(html, playerName);

    foreach (Match m in m1)
    {
        Console.WriteLine(m.Groups[1].Value);
    }
}

There are 3 best solutions below

0
On BEST ANSWER

I'm trying to specifically get the string after charactername= and before " >. 

So you just need a lookbehind and a lookahead, plus LINQ to collect all the match values into a list:

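// Requires: using System.Linq; using System.Text.RegularExpressions;
var input = "your input string";
// Match everything after charactername= up to (but not including) the next double quote
var rx = new Regex(@"(?<=charactername=)[^""]+(?="")");
var res = rx.Matches(input).Cast<Match>().Select(p => p.Value).ToList();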

The res variable should hold all your character names now.
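
As a rough end-to-end check, the sketch below applies the same lookbehind/lookahead pattern to the sample line from the question. The names CharacterNames, Extract and Decode are made up for illustration, and the optional WebUtility.UrlDecode step is an assumption about what the caller wants (it turns the + into a space).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class CharacterNames
{
    // (?<=charactername=) requires the prefix without including it in the match;
    // [^"]+ takes everything up to the next double quote;
    // (?=") requires that closing quote without consuming it.
    private static readonly Regex NamePattern =
        new Regex(@"(?<=charactername=)[^""]+(?="")");

    // Returns the raw (still URL-encoded) names found in the HTML.
    public static List<string> Extract(string html)
    {
        return NamePattern.Matches(html)
                          .Cast<Match>()
                          .Select(m => m.Value)
                          .ToList();
    }

    // Assumption: some callers want "Atro Roter" rather than "Atro+Roter".
    public static string Decode(string rawName)
    {
        return WebUtility.UrlDecode(rawName);
    }
}
```

Calling CharacterNames.Extract on the sample <a href=...> line from the question should give a single entry, Atro+Roter; passing that entry to Decode gives Atro Roter.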

3
On

I assume your issue is trying to parse the URL. Don't - use what .NET gives you:

var playerName = "https://my.examplegame.com/?charactername=NAME_HERE";
var uri = new Uri(playerName);
var queryString = HttpUtility.ParseQueryString(uri.Query);

Console.WriteLine("Name is: " + queryString["charactername"]);

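This is much easier to read than a hand-written pattern, and likely at least as fast. Note that it relies on charactername being a real query parameter (after a ?), and that ParseQueryString URL-decodes the value, so Atro+Roter comes back as Atro Roter.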

Working sample here: https://dotnetfiddle.net/iJlBKW
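
The URL in the question has no ? before charactername=, so uri.Query would be empty for it. The sketch below is one possible way to cover both shapes. The class name CharacterNameFromUrl and the path fallback are assumptions for illustration, not something the question or this answer specifies.

```csharp
using System;
using System.Net;
using System.Web;

// Hypothetical helper; the name and the fallback logic are illustrative assumptions.
public static class CharacterNameFromUrl
{
    private const string Key = "charactername";

    // Returns the decoded character name, or null when none is present.
    public static string Get(string url)
    {
        var uri = new Uri(url);

        // Case 1: a real query string, e.g. https://host/?charactername=Atro+Roter
        // ParseQueryString URL-decodes the value, so '+' becomes a space.
        string fromQuery = HttpUtility.ParseQueryString(uri.Query)[Key];
        if (fromQuery != null)
        {
            return fromQuery;
        }

        // Case 2: the form shown in the question, where the pair sits in the path
        // (https://host/charactername=Atro+Roter) and uri.Query is empty.
        string marker = Key + "=";
        string path = uri.AbsolutePath;
        int start = path.IndexOf(marker, StringComparison.Ordinal);
        if (start < 0)
        {
            return null; // no character name found
        }
        return WebUtility.UrlDecode(path.Substring(start + marker.Length));
    }
}
```

Both URL shapes return the decoded name (Atro Roter). If the raw Atro+Roter is needed instead, the decoding step would have to be skipped.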

2
On

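Escape the forward slashes with backslashes, like this: \/ (.NET's regex engine does not strictly require it, since / is not a metacharacter there, but it is harmless).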

string input = @"<a href=""https://my.examplegame.com/charactername=Atro+Roter"" >";
// Dots are escaped so they match a literal '.' rather than any character
string playerName = @"https:\/\/my\.examplegame\.com\/charactername=(.+?)""";

Match match = Regex.Match(input, playerName);
string result = match.Groups[1].Value;

Result = Atro+Roter
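
As a small sanity check on the escaping question, the sketch below runs the same pattern twice against the sample line, once with plain / and once with \/. The class name SlashEscapingCheck and its members are hypothetical. In .NET both forms are expected to capture the same value, which suggests that if the original code prints nothing, the cause may lie elsewhere, for example in the downloaded HTML differing from the sample (a guess, not something the question confirms).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical comparison helper, for illustration only.
public static class SlashEscapingCheck
{
    // The same pattern twice: once with plain '/', once with '\/'.
    public const string PlainSlashes =
        @"https://my\.examplegame\.com/charactername=(.+?)""";
    public const string EscapedSlashes =
        @"https:\/\/my\.examplegame\.com\/charactername=(.+?)""";

    // Returns capture group 1, or null when the pattern does not match.
    public static string Capture(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

Checking match.Success before reading Groups[1] also avoids silently getting an empty string when nothing matched.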