C# Beginner: Delete ALL between two characters in a string (Regex?)

I have a string containing HTML code. I want to remove all HTML tags, i.e. all characters between < and >.
This is my code snippet:
WebClient wClient = new WebClient();
SourceCode = wClient.DownloadString( txtSourceURL.Text );
txtSourceCode.Text = SourceCode;
//remove here all between "<" and ">"
txtSourceCodeFormatted.Text = SourceCode;
I hope somebody can help me.

Try this:
txtSourceCodeFormatted.Text = Regex.Replace(SourceCode, "<.*?>", string.Empty);
But, as others have mentioned, handle with care.

According to Ravi's answer, you can use
string noHTML = Regex.Replace(inputHTML, @"<[^>]+>|&nbsp;", "").Trim();
and then, to collapse the remaining runs of whitespace,
string noHTMLNormalised = Regex.Replace(noHTML, @"\s{2,}", " ");
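The two calls above can be combined into one helper. This is only a minimal sketch, not a robust HTML parser: the names HtmlText and StripTags are hypothetical, and it assumes simple markup with no ">" inside attribute values, comments, or script blocks.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Hypothetical helper: removes anything that looks like a tag, plus the
    // literal entity &nbsp;, then collapses runs of whitespace to one space.
    public static string StripTags(string html)
    {
        string noTags = Regex.Replace(html, @"<[^>]+>|&nbsp;", "");
        return Regex.Replace(noTags, @"\s{2,}", " ").Trim();
    }

    public static void Main()
    {
        // prints "Hello world"
        Console.WriteLine(StripTags("<p>Hello   <b>world</b></p>"));
    }
}
```

For anything beyond quick text extraction, a real HTML parser is usually the safer choice, which is what "handle with care" refers to.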

Related

Remove a substring after a substring

I have a string, for example
"blabla{code}<br />blabla{code}<br />bla{code}<br />"
How to remove all entries of <br /> after {code}?
I've tried this:
public string removeBR(string comment)
{
Regex codeRegex = new Regex("{code}<br />", RegexOptions.Singleline);
return codeRegex.Replace(comment, new MatchEvaluator(m =>
{
string value = m.Groups[0].Value;
return value.Remove(value.Length - 6);
}));
}
It works but is there any easier way?

Hope this helps you
string input = "blabla{code}<br />blabla{code}<br />bla{code}<br />";
string output = input.Replace("{code}<br />", "{code}");
Console.WriteLine(output);
This works because String.Replace returns a new string in which all occurrences of a specified string in the current instance are replaced with another specified string.

A simple string.Replace should solve this.
string input = @"blabla{code}<br />blabla{code}<br />bla{code}<br />";
string result = input.Replace("{code}<br />", "{code}");

How to Avoid the string from number(for rupees) in c#?

I am new to C#. How do I strip the text from a string and keep only the number?
For example:
label1.Content = "Bal.Rs." + 100;
How do I get only the 100 when reading the value back from label1.Text?

Here you go:
string OnlyNumbered = Regex.Match(label1.Content.ToString(), @"\d+").Value;

Try this:
string input = "abc 123";
string result = Regex.Replace(input, @"[^\d]", "");
// result is "123"
The pattern [^\d] matches any character that is not a digit.
Hope this helps.
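Both answers above produce the digits as text. If a numeric value is needed, the match can be handed to int.TryParse. A minimal sketch, assuming the first run of digits in the text is the amount; the names AmountParser and FirstNumber are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AmountParser
{
    // Hypothetical helper: returns the first run of digits in the text as an
    // int, or null when there are no digits or the value does not fit an int.
    public static int? FirstNumber(string text)
    {
        Match m = Regex.Match(text, @"\d+");
        if (m.Success && int.TryParse(m.Value, out int value))
        {
            return value;
        }
        return null;
    }

    public static void Main()
    {
        // prints 100
        Console.WriteLine(FirstNumber("Bal.Rs." + 100));
    }
}
```

Returning a nullable int keeps the "no number found" case explicit instead of silently yielding 0.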

How to correctly use .replace function?

I am trying to use the .Replace function to take out the line breaks (<br> and </br>), but I am having trouble with the code I have been given.
Here is what I am working with.
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
StreamReader sr = new StreamReader(response.GetResponseStream());
var rawStream = sr.ReadToEnd();
sr.Close();
var myBuilder = new StringBuilder();
myBuilder.AppendLine("Employee Name : " + rawStream.Between("<span id=\"Label_DisplayFullName\">", "</span>"));
myBuilder.AppendLine("Title: " + rawStream.Between("<span id=\"Label_Title\">", "</span>"));
myBuilder.AppendLine("Location : " + rawStream.Between("<span id=\"Label_Location\">", "</span>"));
myBuilder.AppendLine("Department : " + rawStream.Between("<span id=\"Label_Department\">", "</span>"));
myBuilder.AppendLine("Group: " + rawStream.Between("<span id=\"Label_Group\">", "</span>"));
myBuilder.AppendLine("Office Phone : " + rawStream.Between("<span id=\"Label_IntPhoneNumber\">", "</span>"));
myBuilder.AppendLine("Mobile Phone : " + rawStream.Between("<span id=\"Label_BusMobile\">", "</span>"));
richTextBox1.Text = myBuilder.ToString();
I understand that the function's signature looks like this:
public string Replace( string oldValue, string newValue )
I just don't understand how it fits into my code, as I don't really have a string; I have a StringBuilder.
Any assistance would be greatly appreciated.

What does all your StringBuilder stuff have to do with removing the breaks? Your string is in the rawStream variable (quite badly named) (that's what ReadToEnd() gives you), so you'd just:
rawStream = rawStream.Replace("<br>", "");
rawStream = rawStream.Replace("<br />", "");

There are two things you could do:
On the one hand, you do assemble your string with a StringBuilder, however, you eventually convert the contents of that string builder into a string when you call myBuilder.ToString(). That is where you could invoke Replace:
richTextBox1.Text = myBuilder.ToString().Replace("<br>", "").Replace("</br>", "");
Alternatively, StringBuilder has a Replace method of its own, so you could invoke that before transforming the string builder contents into a string:
myBuilder.Replace("<br>", "");
myBuilder.Replace("</br>", "");
Note that the latter can alternatively be invoked in a chained fashion, as well, though that is arguably less readable:
richTextBox1.Text = myBuilder.Replace("<br>", "").Replace("</br>", "").ToString();

Since Replace returns a string, you can chain the calls:
rawStream = rawStream.Replace("<br>","").Replace("<br/>","").Replace("<br />","");

Can't you just put it after the ToString() call?
richTextBox1.Text = myBuilder.ToString().Replace("string1", "string2");
As mentioned in the comments you can also call it on the stringbuilder object itself.
richTextBox1.Text = myBuilder.Replace("string1", "string2").ToString();

Try this.
richTextBox1.Text = myBuilder.Replace("<br>", String.Empty).Replace("</br>", String.Empty).ToString();
Or
var rawStream = sr.ReadToEnd().Replace("<br>", String.Empty).Replace("</br>", String.Empty);
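The chained calls in the answers above remove only the exact spellings they list. As an alternative sketch, a single case-insensitive pattern can cover <br>, <br/>, <br /> and the non-standard </br> in one pass. The names BreakRemover and RemoveBreaks are hypothetical, and the pattern assumes the tags carry no attributes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BreakRemover
{
    // Hypothetical helper: "</?br" allows an optional leading slash,
    // "\s*/?" allows optional whitespace and a self-closing slash.
    public static string RemoveBreaks(string html)
    {
        return Regex.Replace(html, @"</?br\s*/?>", string.Empty, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        // prints "abcde"
        Console.WriteLine(RemoveBreaks("a<br>b</br>c<br />d<BR/>e"));
    }
}
```

In the question's code this could be applied to the final text, for example richTextBox1.Text = BreakRemover.RemoveBreaks(myBuilder.ToString());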

Facebook feed - remove extra Facebook JS from anchor

Please help me to replace all the additional Facebook information from here using C# .net Regex Replace method.
Example
http://on.fb.me/OE6gnBsomehtml
Output
somehtml on.fb.me/OE6gnB somehtml
I tried following regex but they didn't work for me
searchPattern = "<a([.]*)?/l.php([.]*)?(\">)?([.]*)?(</a>)?";
replacePattern = "$3";
Thanks

I managed to do this using a regex with the following code:
searchPattern = "<a(.*?)href=\"/l.php...(.*?)&?(.*?)>(.*?)</a>";
string html1 = Regex.Replace(html, searchPattern, delegate(Match oMatch)
{
return string.Format("{1}", HttpUtility.UrlDecode(oMatch.Groups[2].Value), oMatch.Groups[4].Value);
});

You can try this (System.Web has to be added to use System.Web.HttpUtility):
string input = @"http://on.fb.me/OE6gnBsomehtml";
string rootedInput = String.Format("<root>{0}</root>", input);
XDocument doc = XDocument.Parse(rootedInput, LoadOptions.PreserveWhitespace);
string href;
var anchors = doc.Descendants("a").ToArray();
for (int i = anchors.Count() - 1; i >= 0; i--)
{
href = HttpUtility.ParseQueryString(anchors[i].Attribute("href").Value)[0];
XElement newAnchor = new XElement("a");
newAnchor.SetAttributeValue("href", href);
newAnchor.SetValue(href.Replace(@"http://", String.Empty));
anchors[i].ReplaceWith(newAnchor);
}
string output = doc.Root.ToString(SaveOptions.DisableFormatting)
.Replace("<root>", String.Empty)
.Replace("</root>", String.Empty);

C# Regexp change link format

On my forum I have a lot of redundant link data like:
[url:30l7ypk7]http://www.box.net/shared/0p28sf6hib[/url:30l7ypk7]
In regexp how can I change these to the format:
http://www.box.net/shared/0p28sf6hib

string orig = "[url:30l7ypk7]http://www.box.net/shared/0p28sf6hib[/url:30l7ypk7]";
string replace = "$1";
string regex = @"\[url:.*?](.*?)\[/url:.*?]";
string fixedLink = Regex.Replace(orig, regex, replace);

This isn't doing it totally in Regex but will still work...
string oldUrl = "[url:30l7ypk7]http://www.box.net/shared/0p28sf6hib[/url:30l7ypk7]";
Regex regExp = new Regex(@"http://[^\[]*");
var match = regExp.Match(oldUrl);
string newUrl = string.Format("<a href='{0}' rel='nofollow'>{0}</a>", match.Value);

The pattern \[([^\]]+)\]([^[]+)\[/\1\] captures the tag in group 1 and the URL in group 2, so you can pull out the URL like this:
Regex re = new Regex(@"\[([^\]]+)\]([^[]+)\[/\1\]");
var s = @"[url:30l7ypk7]http://www.box.net/shared/0p28sf6hib[/url:30l7ypk7]";
var replaced = s.Replace(s, string.Format("{0}", re.Match(s).Groups[2].Value));
Console.WriteLine(replaced);

This is just from memory, but I will try to check it over when I have more time. It should help get you started.
string matchPattern = @"\[(url\:\w+)\](.+?)\[/\1\]";
String replacePattern = @"<a href='$2' rel='nofollow'>$2</a>";
String blogText = ...;
blogText = Regex.Replace(blogText, matchPattern, replacePattern);
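Putting the answers above together, a self-contained version might look like the following. It is a sketch under assumptions: the names BbCode and ConvertUrlTags are hypothetical, the anchor format with rel='nofollow' is taken from the answers rather than from the question, and the URL is not HTML-encoded before being placed in the attribute.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BbCode
{
    // Hypothetical helper: rewrites [url:ID]target[/url:ID] as an anchor.
    // Group 1 is the ID (reused via \1 in the closing tag), group 2 the target.
    public static string ConvertUrlTags(string text)
    {
        return Regex.Replace(
            text,
            @"\[url:(\w+)\](.+?)\[/url:\1\]",
            "<a href='$2' rel='nofollow'>$2</a>");
    }

    public static void Main()
    {
        string post = "[url:30l7ypk7]http://www.box.net/shared/0p28sf6hib[/url:30l7ypk7]";
        Console.WriteLine(ConvertUrlTags(post));
    }
}
```

Note the argument order of the static Regex.Replace: input first, then pattern, then replacement. Text that contains no matching tag pair is returned unchanged.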
