How to write a regex that does not replace the &nbsp; character between span or div tags?


I want to change the regular expressions below so that they do not replace or remove &nbsp; when it is found in the aspx page. Any &nbsp; should be skipped rather than replaced with a blank character.

The expressions below work fine; the only issue is that they remove all the &nbsp; characters.

In my aspx code I have tags such as <span class='clscode'>&nbsp;</span>, whose inner text is the &nbsp; character.

Here is my C# code.

using System;
using System.Data;
using System.Configuration;
using System.Collections;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.HtmlControls;
using System.Linq;
using System.Text.RegularExpressions;
public partial class TestPage : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // my code
    }

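    // Whitespace between two adjacent tags (replaced with a single space in Render).
    private static readonly Regex t = new Regex(@">\s+<", RegexOptions.Compiled);
    // A newline together with the whitespace that follows it (removed in Render).
    private static readonly Regex lb = new Regex(@"\n\s+", RegexOptions.Compiled);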
    protected override void Render(HtmlTextWriter writer)
    {
        using (HtmlTextWriter htmlwriter = new HtmlTextWriter(new System.IO.StringWriter()))
        {
            base.Render(htmlwriter);
            string html = htmlwriter.InnerWriter.ToString();
            html = t.Replace(html, "> <");
            html = lb.Replace(html, string.Empty);
            writer.Write(html.Trim());
        }
    }
}

I need output of the following kind. For example, my page has many tags like these (this is a test example):

<div id="dvtest"> <space> <space> <space>
<span>&nbsp;</span><space> <space>
<div id='test2'> sample &nbsp;&nbsp;text&nbsp;    </div></div>

//... more tags like these. I need output like this:

<div id="dvtest"><span>&nbsp;</span><div id='test2'>sample &nbsp;&nbsp;text&nbsp;</div></div>

Note: here <space> means an invisible whitespace character.


There are 2 best solutions below


Try it like this.

If you can't use an HTML parser oriented solution to filter out the tags, here's a simple regex for it.

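// Without RegexOptions.Multiline, $ matches only at the very end of the input
// (or just before a final newline), so this strips trailing newlines there;
// Trim() then removes any remaining leading and trailing whitespace.
string noHTML = Regex.Replace(inputHTML, @"\n$", "").Trim();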

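For the output requested in the question, one possible approach is a whitespace class that excludes the no-break space. The sketch below is only an illustration under stated assumptions: the names HtmlWhitespace, Ws, AroundTags and Minify are hypothetical and not from the original post, and it assumes the page has no <pre>, <textarea> or inline script content whose whitespace must be preserved.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlWhitespace
{
    // [^\S\u00A0] reads as "any whitespace except the no-break space".
    // In .NET, \s also matches U+00A0, so it is excluded explicitly here.
    // The literal text "&nbsp;" contains no whitespace and is never touched.
    private const string Ws = @"[^\S\u00A0]";

    // Matches whitespace runs directly after '>' or directly before '<'.
    private static readonly Regex AroundTags =
        new Regex(@"(?<=>)" + Ws + @"+|" + Ws + @"+(?=<)", RegexOptions.Compiled);

    // Hypothetical helper: strips whitespace that touches a tag boundary.
    public static string Minify(string html)
    {
        return AroundTags.Replace(html, string.Empty);
    }

    public static void Main()
    {
        // Sample input modelled on the example in the question.
        string input =
            "<div id=\"dvtest\">   \n" +
            "<span>&nbsp;</span>  \n" +
            "<div id='test2'> sample &nbsp;&nbsp;text&nbsp;    </div></div>";

        Console.WriteLine(Minify(input));
    }
}
```

In a Render override like the one in the question, Minify(html) would take the place of the two Replace calls. Note that removing all whitespace between inline tags (for example between two <span> elements) can change how the page is displayed, so this may need adjusting for real pages.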

3
On
protected override void Render(HtmlTextWriter writer)
{
    using (HtmlTextWriter htmlwriter = new HtmlTextWriter(new System.IO.StringWriter()))
    {
        string re1 = "( )";
        base.Render(htmlwriter);
        string html = htmlwriter.InnerWriter.ToString();
        Regex r = new Regex(re1, RegexOptions.IgnoreCase | RegexOptions.Singleline);
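        Match m = r.Match(html);
        if (m.Success)
        {
            // Remove every occurrence of the captured text from the page.
            string c1 = m.Groups[1].Value;
            html = html.Replace(c1, "");
        }

        // Always write the page, even when the pattern does not match;
        // otherwise the response would be empty.
        writer.Write(html);
    }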
}
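The code above removes every occurrence of the matched text. If the goal is to keep no-break spaces instead, it can help to check which patterns actually match them. The following is a small sketch, not part of the original answer; the class name NbspDemo is hypothetical. It relies on the documented .NET behaviour that \s covers the Unicode separator categories, which include U+00A0.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NbspDemo
{
    public static void Main()
    {
        string nbsp = "\u00A0"; // the character that &nbsp; decodes to

        // \s matches the literal no-break space character...
        Console.WriteLine(Regex.IsMatch(nbsp, @"\s"));          // True

        // ...but not the six-character entity text "&nbsp;".
        Console.WriteLine(Regex.IsMatch("&nbsp;", @"\s"));      // False

        // A class that excludes U+00A0 does not match the no-break space.
        Console.WriteLine(Regex.IsMatch(nbsp, @"[^\S\u00A0]")); // False

        // The same class still matches an ordinary space.
        Console.WriteLine(Regex.IsMatch(" ", @"[^\S\u00A0]"));  // True
    }
}
```

So if the rendered page contains the literal U+00A0 character rather than the entity text, patterns built on \s will remove it, while [^\S\u00A0] leaves it in place.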