I want to replace every word that starts with # with another word. Here is my code:
public string SemiFinalText { get; set; }
public string FinalText { get; set; }
//sample text : "aaaa bbbb #cccc dddd #eee fff g"
public string GetProperText(string text)
{
if (text.Contains('#'))
{
int index = text.IndexOf('#');
string restText = text.Substring(index);
var indexLast = restText.IndexOf(' ');
var oldName = text.Substring(index, indexLast);
string restText2 = text.Substring( index + indexLast);
SemiFinalText += text.Substring(0, index + indexLast).Replace(oldName, "#New");
if (restText2.Contains('#'))
{
GetProperText(restText2);
}
FinalText = SemiFinalText + restText2;
return FinalText;
}
else
{
return text;
}
}
I want the recursion to stop once return FinalText; is executed. How can I fix this?
Maybe another approach is better than a recursive function. If you know one, please post it as an answer.
You don't need a recursive solution for this problem. You have a string containing a number of words (separated by spaces) and you want to replace the ones starting with '#' with another string. You can replace your solution with a simple method that splits the text on spaces, replaces every word starting with #, and then joins the words back together.
Using Linq:
string text = "aaaa bbbb #cccc dddd #eee fff g";
FinalText = GetProperText(text, "New");
public string GetProperText(string text, string replacewith)
{
text = string.Join(" ", text.Split(' ').Select(x => x.StartsWith("#") ? replacewith: x));
return text;
}
Output: aaaa bbbb New dddd New fff g
Using Regex:
Regex rgx = new Regex("#([^ #])*");
string result = rgx.Replace(text, replaceword);
Solution with Regular Expressions:
using System;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
string pattern = @"#\w+";
var r = new Regex(pattern);
Console.WriteLine(r.Replace("ABC #ABC ABC #DEF klm.#bhsh", "BOOM!"));
}
}
This does not rely on the space character being the delimiter: any non-word character (anything other than letters, digits, and underscores) can separate the 'words'. This example outputs:
ABC BOOM! ABC BOOM! klm.BOOM!
You can test it out here: https://dotnetfiddle.net/rZyjjg
If you're new to Regex: .NET Introduction to Regular Expressions
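Note that #\w+ also replaces a # that sits in the middle of a token, as in klm.#bhsh above. If only tokens that begin with # should be replaced, one option is a negative lookbehind that requires the # to be at the start of the text or right after whitespace. This is a minimal sketch; the HashWordReplacer class and ReplaceHashWords method are names made up for illustration.

```csharp
using System.Text.RegularExpressions;

public static class HashWordReplacer
{
    // (?<!\S) succeeds only when the previous character is not a
    // non-whitespace character, i.e. at the start of the text or
    // directly after whitespace.
    private static readonly Regex HashWord = new Regex(@"(?<!\S)#\w+");

    // Hypothetical helper: replaces every token that begins with '#'.
    public static string ReplaceHashWords(string text, string replacement)
    {
        return HashWord.Replace(text, replacement);
    }
}
```

With this pattern, the #bhsh in klm.#bhsh is left alone because it is preceded by a dot rather than by whitespace.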
Here is also the proper way to do it recursively, for anyone interested. I think your stopping condition was actually okay, but you should concatenate the outcome of the recursive function call to the already processed text. Also, I think that using global variables in a recursive function defeats its purpose a little bit.
That being said, I think that using Regex as in one of the other answers is better and faster.
The recursive code:
//sample text : "aaaa bbbb #cccc dddd #eee fff g"
public string GetProperText(string text)
{
if (text.Contains('#'))
{
int index = text.IndexOf('#'); //Index of the first occurring '#'
int indexLast = text.IndexOf(' ', index); //Index of the first ' ' after '#'
if (indexLast == -1) { indexLast = text.Length; } //The '#' word runs to the end of the text
string processedText = text.Substring(0, index) + "New"; //Text before '#' plus the new name
string restText = text.Substring(indexLast); //Rest of the text, starting at the space
//Here the outcome of the recursive call is appended to the already processed part.
return processedText + GetProperText(restText);
}
else
{
//Stopping condition: no '#' left.
return text;
}
}
I have a big String in my program.
For Example:
String Newspaper = "...Blablabla... What do you like?...Blablabla... ";
Now I want to cut out the "What do you like?" and write it to a new String. The problem is that the "Blablabla" is something different every time. By "cut out" I mean that you pass a start word and an end word, and everything between them (including the two words) should end up in the new string. I need this because the sentence "What do you like?" sometimes changes, except for the start word "What" and the end word "like?".
Thanks for every response.
You can write the following method:
public static string CutOut(string s, string start, string end)
{
int startIndex = s.IndexOf(start);
if (startIndex == -1) {
return null;
}
int endIndex = s.IndexOf(end, startIndex);
if (endIndex == -1) {
return null;
}
return s.Substring(startIndex, endIndex - startIndex + end.Length);
}
It returns null if either the start or end pattern is not found. Only end patterns that follow the start pattern are searched for.
If you are working with C# 8+ and .NET Core 3.0+, you can also replace the last line with
return s[startIndex..(endIndex + end.Length)];
Test:
string input = "...Blablabla... What do you like?...Blablabla... ";
Console.WriteLine(CutOut(input, "What ", " like?"));
prints:
What do you like?
If you are happy with Regex, you can also write:
public static string CutOutRegex(string s, string start, string end)
{
Match match = Regex.Match(s, $@"\b{Regex.Escape(start)}.*{Regex.Escape(end)}");
if (match.Success) {
return match.Value;
}
return null;
}
The \b ensures that the start pattern is only found at the beginning of a word. You can drop it if you want. Also, because .* is greedy, if the end pattern occurs more than once the match extends to its last occurrence, unlike the first example with IndexOf, which stops at the first one.
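If the regex version should stop at the first occurrence of the end pattern, as the IndexOf version does, a lazy quantifier is one option. This is a sketch under that assumption; CutOutSketch and CutOutRegexLazy are hypothetical names, and like the original pattern it does not match across line breaks unless RegexOptions.Singleline is added.

```csharp
using System.Text.RegularExpressions;

public static class CutOutSketch
{
    // Lazy variant: ".*?" expands as little as possible, so the match
    // ends at the first occurrence of the end pattern after the start.
    public static string CutOutRegexLazy(string s, string start, string end)
    {
        Match match = Regex.Match(s, $@"\b{Regex.Escape(start)}.*?{Regex.Escape(end)}");
        return match.Success ? match.Value : null;
    }
}
```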
You have to do a substring, like the example below. See source for more information on substrings.
// A long string
string bio = "Mahesh Chand is a founder of C# Corner. Mahesh is also an " +
"author, speaker, and software architect. Mahesh founded C# Corner in " +
"2000.";
// Get first 12 characters substring from a string
string authorName = bio.Substring(0, 12);
Console.WriteLine(authorName);
In this case I would do it like this: split on the first fixed word, then split the remainder on the second, and concatenate the piece in between with the two fixed words, which also serve as the split delimiters.
public string CutPhrase(string phrase)
{
var fst = "What";
var snd = "like?";
string[] cut1 = phrase.Split(new[] { fst }, StringSplitOptions.None);
string[] cut2 = cut1[1].Split(new[] { snd }, StringSplitOptions.None);
var rst = $"{fst} {cut2[0].Trim()} {snd}"; // Trim avoids doubled spaces around the middle part
return rst;
}
I'm having issues doing a find/replace type of action in my function. I'm extracting the < a href="link">anchor from an article and replacing it with this format: [link anchor]. The link and anchor will be dynamic, so I can't hard-code the values. What I have so far is:
public static string GetAndFixAnchor(string articleBody, string articleWikiCheck) {
string theString = string.Empty;
switch (articleWikiCheck) {
case "id|wpTextbox1":
StringBuilder newHtml = new StringBuilder(articleBody);
Regex r = new Regex(@"\<a href=\""([^\""]+)\"">([^<]+)");
string final = string.Empty;
foreach (var match in r.Matches(theString).Cast<Match>().OrderByDescending(m => m.Index))
{
string text = match.Groups[2].Value;
string newHref = "[" + match.Groups[1].Index + " " + match.Groups[1].Index + "]";
newHtml.Remove(match.Groups[1].Index, match.Groups[1].Length);
newHtml.Insert(match.Groups[1].Index, newHref);
}
theString = newHtml.ToString();
break;
default:
theString = articleBody;
break;
}
Helpers.ReturnMessage(theString);
return theString;
}
Currently, it just returns the article as it originally is, with the traditional anchor text format: < a href="link">anchor
Can anyone see what I have done wrong?
regards
If your input is HTML, you should consider using a corresponding parser, HtmlAgilityPack being really helpful.
As for the current code, it looks too verbose. You may use a single Regex.Replace to perform the search and replace in one pass:
public static string GetAndFixAnchor(string articleBody, string articleWikiCheck) {
if (articleWikiCheck == "id|wpTextbox1")
{
return Regex.Replace(articleBody, @"<a\s+href=""([^""]+)"">([^<]+)", "[$1 $2]");
}
else
{
// Helpers.ReturnMessage(articleBody); // Uncomment if it is necessary
return articleBody;
}
}
See the regex demo.
The <a\s+href="([^"]+)">([^<]+) regex matches <a, 1 or more whitespaces, href=", then captures into Group 1 any one or more chars other than ", then matches "> and then captures into Group 2 any one or more chars other than <.
The [$1 $2] replacement replaces the matched text with [, Group 1 contents, space, Group 2 contents and a ].
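The pattern above stops before any closing tag. If the article text also contains the closing </a> (an assumption; the question does not show it), that tag would be left behind after the replacement. The following sketch extends the same pattern to consume it; AnchorSketch and ToWikiLinks are hypothetical names, and it assumes the anchor has no attributes other than href.

```csharp
using System.Text.RegularExpressions;

public static class AnchorSketch
{
    // Assumption: anchors look like <a href="...">text</a> with no
    // other attributes and no nested tags inside the anchor text.
    public static string ToWikiLinks(string html)
    {
        // Group 1 = link, Group 2 = anchor text; the closing tag is
        // matched too, so it is removed along with the opening tag.
        return Regex.Replace(html, @"<a\s+href=""([^""]+)"">([^<]+)</a>", "[$1 $2]");
    }
}
```

For example, ToWikiLinks applied to the text See <a href="http://example.com">Example</a> now. gives See [http://example.com Example] now.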
Updated (Corrected regex to support whitespaces and new lines)
You can try this expression
Regex r = new Regex(@"<[\s\n]*a[\s\n]*(([^\s]+\s*[ ]*=*[ ]*[\s|\n*]*('|"").*\3)[\s\n]*)*href[ ]*=[ ]*('|"")(?<link>.*)\4[.\n]*>(?<anchor>[\s\S]*?)[\s\n]*<\/[\s\n]*a>");
It will match your anchors even if they are split across multiple lines. It is this long because it allows whitespace between the tags and their values, and because .NET regular expressions do not support subroutines, so the [\s\n]* part has to be repeated multiple times.
You can see a working sample at dotnetfiddle
You can use it in your example like this.
public static string GetAndFixAnchor(string articleBody, string articleWikiCheck) {
if (articleWikiCheck == "id|wpTextbox1")
{
return Regex.Replace(articleBody,
@"<[\s\n]*a[\s\n]*(([^\s]+\s*[ ]*=*[ ]*[\s|\n*]*('|"").*\3)[\s\n]*)*href[ ]*=[ ]*('|"")(?<link>.*)\4[.\n]*>(?<anchor>[\s\S]*?)[\s\n]*<\/[\s\n]*a>",
"[${link} ${anchor}]");
}
else
{
return articleBody;
}
}
I have a bunch of strings in a text, which look something like this:
h1. this is the Header
h3. this one the header too
h111. and this
And I have a function that is supposed to process this text depending on, let's say, the iteration at which it is called:
public void ProcessHeadersInText(string inputText, int atLevel = 1)
so the output should look like the one below when it is called as
ProcessHeadersInText(inputText, 2)
Output should be:
<h3>this is the Header</h3>
<h5>this one the header too</h5>
<h9>and this</h9>
(the last one looks like this because if the value after the letter h is greater than 9, it is supposed to be 9 in the output)
So, I started to think about using regex.
Here's the example https://regex101.com/r/spb3Af/1/
(As you can see, I came up with a regex like this (^(h([\d]+)\.+?)(.+?)$) and tried to use the substitution <h$3>$4</h$3> on it)
It's almost what I'm looking for, but I need to add some logic to handle the heading level.
Is it possible to work with variables in the substitution?
Or do I need to find another way? (extract all headings first, replace them taking the function arguments and the header value into account, and only then use the regex I wrote?)
The regex you may use is
^h(\d+)\.+\s*(.+)
If you need to make sure the match does not span across lines, you may replace \s with [^\S\r\n]. See the regex demo.
When replacing inside C#, parse Group 1 value to int and increment the value inside a match evaluator inside Regex.Replace method.
Here is the example code that will help you:
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.IO;
public class Test
{
// Demo: https://regex101.com/r/M9iGUO/2
public static readonly Regex reg = new Regex(@"^h(\d+)\.+\s*(.+)", RegexOptions.Compiled | RegexOptions.Multiline);
public static void Main()
{
var inputText = "h1. Topic 1\r\nblah blah blah, because of bla bla bla\r\nh2. PartA\r\nblah blah blah\r\nh3. Part a\r\nblah blah blah\r\nh2. Part B\r\nblah blah blah\r\nh1. Topic 2\r\nand its cuz blah blah\r\nFIN";
var res = ProcessHeadersInText(inputText, 2);
Console.WriteLine(res);
}
public static string ProcessHeadersInText(string inputText, int atLevel = 1)
{
return reg.Replace(inputText, m =>
// Add atLevel first, then cap the result at 9
string.Format("<h{0}>{1}</h{0}>",
Math.Min(int.Parse(m.Groups[1].Value) + atLevel, 9), m.Groups[2].Value.Trim()));
}
}
See the C# online demo
Note I am using .Trim() on m.Groups[2].Value as . matches \r. You may use TrimEnd('\r') to get rid of this char.
You can use a Regex like the one below to fix your issue.
Regex.Replace(s, @"^(h\d+)\.(.*)$", @"<$1>$2</$1>", RegexOptions.Multiline)
Let me explain what I am doing:
// This will capture the header number which is followed
// by a '.' but ignore the . in the capture
(h\d+)\.
// This will capture the remaining of the string till the end
// of the line (see the multi-line regex option being used)
(.*)$
The parentheses capture the matched text into groups that can be referenced as "$1" for the first capture and "$2" for the second capture.
Try this:
private static string ProcessHeadersInText(string inputText, int atLevel = 1)
{
// Group 1 = value after 'h'
// Group 2 = Content of header without leading whitespace
string pattern = @"^h(\d+)\.\s*(.*?)\r?$";
return Regex.Replace(inputText, pattern, match => EvaluateHeaderMatch(match, atLevel), RegexOptions.Multiline);
}
private static string EvaluateHeaderMatch(Match m, int atLevel)
{
int hVal = int.Parse(m.Groups[1].Value) + atLevel;
if (hVal > 9) { hVal = 9; }
return $"<h{hVal}>{m.Groups[2].Value}</h{hVal}>";
}
Then just call
ProcessHeadersInText(input, 2);
This uses the Regex.Replace(string, string, MatchEvaluator, RegexOptions) overload with a custom evaluator function.
You could of course streamline this solution into a single function with an inline lambda expression:
public static string ProcessHeadersInText(string inputText, int atLevel = 1)
{
string pattern = @"^h(\d+)\.\s*(.*?)\r?$";
return Regex.Replace(inputText, pattern,
match =>
{
int hVal = int.Parse(match.Groups[1].Value) + atLevel;
if (hVal > 9) { hVal = 9; }
return $"<h{hVal}>{match.Groups[2].Value}</h{hVal}>";
},
RegexOptions.Multiline);
}
There are a lot of good solutions in this thread, but I don't think you really need a Regex solution for your problem. For fun and as a challenge, here is a non-regex solution:
Try it online!
using System;
using System.Linq;
public class Program
{
public static void Main()
{
string extractTitle(string x) => x.Substring(x.IndexOf(". ") + 2);
string extractNumber(string x) => x.Remove(x.IndexOf(". ")).Substring(1);
string build(string n, string t) => $"<h{n}>{t}</h{n}>";
var inputs = new [] {
"h1. this is the Header",
"h3. this one the header too",
"h111. and this" };
foreach (var line in inputs.Select(x => build(extractNumber(x), extractTitle(x))))
{
Console.WriteLine(line);
}
}
}
I use C# 7 local functions and C# 6 interpolated strings. If you want, I can use older C# features. The code should be easy to read; I can add comments if needed.
C#5 version
using System;
using System.Linq;
public class Program
{
static string extractTitle(string x)
{
return x.Substring(x.IndexOf(". ") + 2);
}
static string extractNumber(string x)
{
return x.Remove(x.IndexOf(". ")).Substring(1);
}
static string build(string n, string t)
{
return string.Format("<h{0}>{1}</h{0}>", n, t);
}
public static void Main()
{
var inputs = new []{
"h1. this is the Header",
"h3. this one the header too",
"h111. and this"
};
foreach (var line in inputs.Select(x => build(extractNumber(x), extractTitle(x))))
{
Console.WriteLine(line);
}
}
}
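The two versions above rebuild the tags with the original heading number. Below is a sketch of how the atLevel offset and the cap at 9 from the question could be added without regex, assuming every line has the exact form h<digits>. <title>. The HeaderSketch class and ConvertHeader method are hypothetical names.

```csharp
using System;

public static class HeaderSketch
{
    // Converts one line such as "h3. title" to "<hN>title</hN>",
    // where N = original level + atLevel, capped at 9.
    // Assumption: the line has the exact form "h<digits>. <title>".
    public static string ConvertHeader(string line, int atLevel)
    {
        int dot = line.IndexOf(". ", StringComparison.Ordinal);
        // The digits sit between the leading 'h' and the dot.
        int level = int.Parse(line.Substring(1, dot - 1)) + atLevel;
        level = Math.Min(level, 9);
        string title = line.Substring(dot + 2);
        return $"<h{level}>{title}</h{level}>";
    }
}
```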
I have been trying really hard to understand regular expressions. Is there any way I can replace the character(s) between two regex matches? For example, I have
string datax = "a4726e1e-babb-4898-a5d5-e29d2bc40028;POPULATE DATA AØ99c1d133-15f5-4ef5-bc59- d9ed673b70c6;POPULATE DATA BØ";
How can I remove the string between ";" and "Ø"?
I tried code like this:
string xresult = Regex.Replace(datax, @"(?<=;)(\w+?)(?=Ø)", "");
But it is not working.
Please correct it and give me a solution.
Thanks.
I want the result to look like this:
string datax = "a4726e1e-babb-4898-a5d5-e29d2bc40028;Ø99c1d133-15f5-4ef5-bc59-d9ed673b70c6;Ø";
I think you need to understand regex a little better and how the replace function works. With regex you're defining capture groups, and with the replace function you want to replace those groups.
how to remove string between regex ";" and "Ø" ???
Step 1: First find ";", then capture all characters up to and including "Ø".
That's (;.*?Ø)
( New Capture Group
; Match ";"
. Match Anything
* Zero or more times
? Be Lazy
Ø Match "Ø"
) End Capture
Step 2: Replace each group with ";Ø"
public static string Replace(string input, string pattern, string replacement)
So you need to put back the ";Ø" you removed from the original capture.
static void Test2()
{
foreach (string item in SO2588078())
{
Console.WriteLine(item);
}
string input = "a4726e1e-babb-4898-a5d5-e29d2bc40028;POPULATE DATA AØ99c1d133-15f5-4ef5-bc59- d9ed673b70c6;POPULATE DATA BØ";
string regex = "(;.*?Ø)";
string output = Regex.Replace(input, regex, ";Ø");
if (output == string.Join(";Ø", SO2588078()) + ";Ø")
{
Console.WriteLine("TRUE");
}
}
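The lookaround attempt from the question can also be made to work. It fails as written because \w+ does not match the spaces in POPULATE DATA A. Matching any character other than Ø between the lookarounds keeps both delimiters in place, so nothing has to be put back. A minimal sketch (the BetweenSketch class and RemoveBetween method are names made up for illustration):

```csharp
using System.Text.RegularExpressions;

public static class BetweenSketch
{
    // Removes everything between each ';' and the next 'Ø'.
    // (?<=;) and (?=Ø) only assert the delimiters, so they are
    // not part of the match and survive the replacement.
    public static string RemoveBetween(string input)
    {
        return Regex.Replace(input, @"(?<=;)[^Ø]*(?=Ø)", "");
    }
}
```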
An alternative would be to parse the string without regex. It's a simple format, and this gives you more control over the process: since you can step through the code, you can see what's happening, why it has gone wrong, and why it gives the results it does.
private static IEnumerable<string> SO2588078()
{
string datax = "a4726e1e-babb-4898-a5d5-e29d2bc40028;POPULATE DATA AØ99c1d133-15f5-4ef5-bc59- d9ed673b70c6;POPULATE DATA BØ";
string temp = datax;
while (!string.IsNullOrEmpty(temp))
{
int index1 = temp.IndexOf(';');
if (index1 > -1)
{
string guid = temp.Remove(index1);
yield return guid;
int index2 = temp.IndexOf('Ø');
if (index2 > -1)
{
temp = temp.Substring(index2 + 1);
}
else
{
temp = null;
}
}
else
{
temp = null;
}
}
}
Is there some built-in method that adds quotes around a string in C#?
Do you mean just adding quotes? Like this?
text = "\"" + text + "\"";
I don't know of a built-in method to do that, but it would be easy to write one if you wanted to:
public static string SurroundWithDoubleQuotes(this string text)
{
return SurroundWith(text, "\"");
}
public static string SurroundWith(this string text, string ends)
{
return ends + text + ends;
}
That way it's a little more general:
text = text.SurroundWithDoubleQuotes();
or
text = text.SurroundWith("'"); // For single quotes
I can't say I've needed to do this often enough to make it worth having a method though...
string quotedString = string.Format("\"{0}\"", originalString);
Yes, using concatenation and escaped characters
myString = "\"" + myString + "\"";
Maybe an extension method
public static string Quoted(this string str)
{
return "\"" + str + "\"";
}
Usage:
var s = "Hello World";
Console.WriteLine(s.Quoted());
No, but you can write your own method or create an extension method:
string AddQuotes(string str)
{
return string.Format("\"{0}\"", str);
}
Using Escape Characters
Just prefix the special character with a backslash, which is known as an escape character.
Simple Examples
string MyString = "Hello";
Response.Write(MyString);
This would print:
Hello
But:
string MyString = "The man said \"Hello\"";
Response.Write(MyString);
Would print:
The man said "Hello"
Alternative
You can use the @ prefix (a verbatim string literal) to avoid most escaping; see this link:
http://www.kowitz.net/archive/2007/03/06/the-c-string-literal
Then, inside such a string, you write two double-quote characters to represent one literal double quote. For example:
string MyString = @"The man said ""Hello"" and went on his way";
Response.Write(MyString);
Outputs:
The man said "Hello" and went on his way
I'm a bit of a C# novice myself, so have at me, but I have this in a catch-all utility class 'cause I miss Perl:
// overloaded quote - if no quote chars spec'd, use ""
public static string quote(string s) {
return quote(s, "\"\"");
}
// quote a string
// q = two quote chars, like "", '', [], (), {} ...
// or another quoted string (quote-me-like-that)
public static string quote(string s, string q) {
if(q.Length == 0) // no quote chars, use ""
q = "\"\"";
else if(q.Length == 1) // one quote char, double it - your mileage may vary
q = q + q;
else if(q.Length > 2) // longer string == quote-me-like-that
q = q.Substring(0, 1) + q.Substring(q.Length - 1, 1);
if(s.Length == 0) // nothing to quote, return empty quotes
return q;
return q[0] + s + q[1];
}
Use it like this:
quote("this with default");
quote("not recommended to use one char", "/");
quote("in square brackets", "[]");
quote("quote me like that", "{like this?}");
Returns:
"this with default"
/not recommended to use one char/
[in square brackets]
{quote me like that}
In my case I wanted to add quotes only if the string was not already surrounded in quotes, so I did:
(this is slightly different to what I actually did, so it's untested)
public static string SurroundWith(this string text, string ends)
{
if (!(text.StartsWith(ends) && text.EndsWith(ends)))
{
return string.Format("{1}{0}{1}", text, ends);
}
else
{
return text;
}
}
There is no built-in method that does what you require.
There is a SplitQuotes method that does something related:
Input - This is a "very long" string
Output - This, is, a, very long, string
When you get a string from a textbox or some other control, it comes with quotes.
If you still want to place quotes, you can use this kind of method:
private string PlaceQuotes(string str, int startPosition, int lastPosition)
{
string quotedString = string.Empty;
string replacedString = str.Replace(str.Substring(0, startPosition),str.Substring(0, startPosition).Insert(startPosition, "'")).Substring(0, lastPosition).Insert(lastPosition, "'");
return String.Concat(replacedString, str.Remove(0, replacedString.Length));
}
Modern C# version below. Using string.Create() we avoid unnecessary allocations:
public static class StringExtensions
{
public static string Quote(this string s) => Surround(s, '"');
public static string Surround(this string s, char c)
{
return string.Create(s.Length + 2, s, (chars, state) =>
{
chars[0] = c;
state.CopyTo(chars.Slice(1));
chars[^1] = c;
});
}
}