Regex - Removing specific characters before the final occurrence of #


I'm trying to remove certain characters [.&#] that appear before the final occurrence of #. After that final #, those characters should be left alone.
This is what I have so far.
string pattern = @"\.|\&|\#(?![^#]+$)|[^a-zA-Z#]";
string input = "username#middle&something.else#company.com";
// input, pattern, replacement
string result = Regex.Replace(input, pattern, string.Empty);
Console.WriteLine(result);
Output: usernamemiddlesomethingelse#companycom
This currently removes every occurrence of the specified characters, apart from the final #, so the . after the final # is lost as well. I'm not sure how to get this to work. Help, please?

You may use
[.&#]+(?=.*#)
Or, equivalently, [.&#]+(?![^#]*$). See the regex demo.
Details
[.&#]+ - 1 or more ., & or # chars
(?=.*#) - followed with any 0+ chars (other than LF) as many as possible and then a #.
See the C# demo:
string pattern = @"[.&#]+(?=.*#)";
string input = "username#middle&something.else#company.com";
string result = Regex.Replace(input, pattern, string.Empty);
Console.WriteLine(result);
// => usernamemiddlesomethingelse#company.com
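To check that the two patterns behave the same way on single-line input, here is a minimal sketch. The class and method names (LastHashDemo, StripWithLookahead, StripWithNegativeLookahead) are hypothetical and only serve the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LastHashDemo
{
    // Removes runs of '.', '&' or '#' that still have a '#' somewhere after them.
    public static string StripWithLookahead(string input) =>
        Regex.Replace(input, @"[.&#]+(?=.*#)", string.Empty);

    // Same idea, phrased as "not followed only by non-# chars up to the end".
    public static string StripWithNegativeLookahead(string input) =>
        Regex.Replace(input, @"[.&#]+(?![^#]*$)", string.Empty);

    public static void Main()
    {
        string input = "username#middle&something.else#company.com";
        Console.WriteLine(StripWithLookahead(input));         // usernamemiddlesomethingelse#company.com
        Console.WriteLine(StripWithNegativeLookahead(input)); // usernamemiddlesomethingelse#company.com
        // With no '#' at all, neither pattern matches, so nothing is removed.
        Console.WriteLine(StripWithLookahead("no.hash&here")); // no.hash&here
    }
}
```

The two can differ on input containing line breaks, because . does not match LF by default while [^#] does.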

Just a simple solution (and alternative to complex regex) using Substring and LastIndexOf:
string pattern = @"[.#&]";
string input = "username#middle&something.else#company.com";
int lastHash = input.LastIndexOf('#'); // assumes input contains at least one '#'
string inputBeforeLastHash = input.Substring(0, lastHash);
// input, pattern, replacement
string result = Regex.Replace(inputBeforeLastHash, pattern, string.Empty) + input.Substring(lastHash);
Console.WriteLine(result);
Try it with this fiddle.
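One caveat with the Substring and LastIndexOf approach: LastIndexOf returns -1 when the input has no #, and Substring(0, -1) then throws. A minimal sketch of a guarded variant follows. The helper name Clean is hypothetical, and returning the input unchanged when there is no # is an assumption about the desired behavior.

```csharp
using System;

public static class SplitAtLastHash
{
    // Hypothetical helper: strips '.', '&' and '#' from the part before the last '#'.
    public static string Clean(string input)
    {
        int last = input.LastIndexOf('#');
        if (last < 0) return input; // assumption: no '#' means nothing to clean

        string head = input.Substring(0, last); // everything before the last '#'
        string tail = input.Substring(last);    // the last '#' and what follows

        // Plain string replacement is enough here; no regex needed.
        head = head.Replace(".", "").Replace("&", "").Replace("#", "");
        return head + tail;
    }

    public static void Main()
    {
        Console.WriteLine(Clean("username#middle&something.else#company.com"));
        // usernamemiddlesomethingelse#company.com
        Console.WriteLine(Clean("no.hash&here")); // no.hash&here
    }
}
```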

Related

Replace ab in string |var11=ab|var12=100|var21=cd|var22=200| using regular expression

I want to replace the ab that follows var11= in the given string.
Input:|var11=ab|var12=100|var21=cd|var22=200|
My code is as follows:
string input = "|var11=ab|var12=100|var21=cd|var22=200|";
string pattern = @"^.var11=([a-z]+).";
string value = Regex.Replace(input, pattern, "ep");
and the output I got is:
epvar12=100|var21=cd|var22=200|
but the expected output was:
|var11=ep|var12=100|var21=cd|var22=200|

You may use
string input = "|var11=ab|var12=100|var21=cd|var22=200|";
string pattern = @"(?<=\bvar11=)[^|]+";
string value = Regex.Replace(input, pattern, "ep");
Or, a capturing group approach:
string pattern = @"\b(var11=)[^|]+";
string value = Regex.Replace(input, pattern, "${1}ep");
See the .NET regex demo
Details
(?<=\bvar11=) - a location immediately preceded with a whole word var11=
[^|]+ - 1+ non-pipe chars.
If you want to update the var11 value only when it is preceded with | or sits at the start of the string, use
string pattern = @"(?<=(?:^|\|)var11=)[^|]+";
where (?:^|\|) matches start of string (^) or (|) a pipe char (\|).
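The difference between the word-boundary version and the anchored version only shows up when var11= follows some other non-word character. Here is a minimal sketch, using a made-up input in which one key is written as a.var11; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Var11Demo
{
    // Matches the value after any whole-word "var11=".
    public static string ReplaceAfterWordBoundary(string input) =>
        Regex.Replace(input, @"(?<=\bvar11=)[^|]+", "ep");

    // Matches the value only when "var11=" follows a pipe or the start of the string.
    public static string ReplaceAfterPipeOrStart(string input) =>
        Regex.Replace(input, @"(?<=(?:^|\|)var11=)[^|]+", "ep");

    public static void Main()
    {
        string original = "|var11=ab|var12=100|var21=cd|var22=200|";
        Console.WriteLine(ReplaceAfterWordBoundary(original));
        // |var11=ep|var12=100|var21=cd|var22=200|

        string tricky = "|a.var11=ab|var11=cd|"; // made-up input for illustration
        Console.WriteLine(ReplaceAfterWordBoundary(tricky)); // |a.var11=ep|var11=ep|
        Console.WriteLine(ReplaceAfterPipeOrStart(tricky));  // |a.var11=ab|var11=ep|
    }
}
```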

How to remove multiple first characters using regex?

I have the string: string A = "... :-ggw..-:p";
I am using this regex: string B = Regex.Replace(A, @"^\.+|:|-|", "").Trim();
My output is ggw..p.
What I want is ggw..-:p.
Thanks

You may use a character class with your symbols and whitespace shorthand character class:
string B = Regex.Replace(A, @"^[.:\s-]+", "");
See the regex demo
Details
^ - start of string
[.:\s-]+ - one or more characters defined in the character class.
Note that there is no need to escape . inside [...]. The - does not have to be escaped either, since it is at the end of the character class.

A regex isn't necessary if you only want to trim specific characters from the start of a string. System.String.TrimStart() will do the job:
var source = "... :-ggw..-:p";
var charsToTrim = " .:-".ToCharArray();
var result = source.TrimStart(charsToTrim);
Console.WriteLine(result);
// Result is 'ggw..-:p'
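The regex and TrimStart versions agree on the sample input, but they are not identical in general: \s covers every whitespace character, while TrimStart only removes the characters it is given. A minimal sketch of the comparison, with hypothetical method names:

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrimLeadingDemo
{
    public static string WithRegex(string s) =>
        Regex.Replace(s, @"^[.:\s-]+", "");

    public static string WithTrimStart(string s) =>
        s.TrimStart(' ', '.', ':', '-');

    public static void Main()
    {
        string a = "... :-ggw..-:p";
        Console.WriteLine(WithRegex(a));     // ggw..-:p
        Console.WriteLine(WithTrimStart(a)); // ggw..-:p

        // A leading tab matches \s but is not in the TrimStart list,
        // so the two versions give different results here.
        Console.WriteLine(WithRegex("\t.x") == WithTrimStart("\t.x")); // False
    }
}
```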

c# regex replace everything including a word base64, with nothing and keeping rest of the string

I want to take a string, find base64, and get rid of that and everything before it.
Example:
"asdfjljlkjaldf_base64,234u0909230948098234082304802384023094"
Notice "base64," ... I want ONLY everything after "base64,"
Desired: "234u0909230948098234082304802384023094"
I was looking at this code:
string test = "hello, base64, matching";
string regexStrTest;
regexStrTest = @"test\s\w+";
MatchCollection m1 = Regex.Matches(test, regexStrTest);
//gets the second matched value
string value = m1[1].Value;
but that is not quite what I want..

Why regular expressions? IndexOf + Substring seems to be quite enough:
string source = "asdfjljlkjaldf_base64,234u0909230948098234082304802384023094";
string tag = "base64,";
string result = source.Substring(source.IndexOf(tag) + tag.Length);
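Note that IndexOf returns -1 when the tag is missing, in which case the one-liner silently returns the wrong substring (or throws on short input). A minimal sketch of a guarded variant follows. The helper name AfterTag is hypothetical, and returning the source unchanged when the tag is absent is an assumption.

```csharp
using System;

public static class Base64TagDemo
{
    // Hypothetical helper: returns everything after the first occurrence of tag.
    public static string AfterTag(string source, string tag)
    {
        int index = source.IndexOf(tag, StringComparison.Ordinal);
        // Assumption: when the tag is absent, hand the input back untouched.
        return index < 0 ? source : source.Substring(index + tag.Length);
    }

    public static void Main()
    {
        string source = "asdfjljlkjaldf_base64,234u0909230948098234082304802384023094";
        Console.WriteLine(AfterTag(source, "base64,"));
        // 234u0909230948098234082304802384023094
        Console.WriteLine(AfterTag("no tag here", "base64,")); // no tag here
    }
}
```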

You tried a regex that matches test, a whitespace, and 1+ word chars. The input string just did not match it.
You may use
var results = Regex.Matches(s, @"base64,(\w+)")
.Cast<Match>()
.Select(m => m.Groups[1].Value)
.ToList();
See the regex demo.
The pattern matches base64, substring and then captures into Group 1 one or more word chars with (\w+) pattern. The captured value is stored inside match.Groups[1].Value, just what you get with .Select(m => m.Groups[1].Value).

Some of the other answers are good. Here is a very simple regex
string yourData = "asdfjljlkjaldf_base64,234u0909230948098234082304802384023094";
var newString = Regex.Replace(yourData, "^.*base64,", "");

search string for everything before a set of characters in C#

I'm looking for a way to search a string for everything before a set of characters in C#. For Example, if this is my string value:
This is is a test.... 12345
I want to build a new string with all of the characters before "12345".
So my new string would equal "This is is a test.... "
Is there a way to do this?
I've found Regex examples where you can focus on one character but not a sequence of characters.

You don't need to use a Regex:
public string GetBitBefore(string text, string end)
{
    var index = text.IndexOf(end);
    if (index == -1) return text;
    return text.Substring(0, index);
}

You can use a lazy quantifier to match anything, followed by a lookahead:
var match = Regex.Match("This is is a test.... 12345", @".*?(?=\d{5})");
where:
.*? lazily matches everything (up to the lookahead)
(?=…) is a positive lookahead: the pattern must be matched, but is not included in the result
\d{5} matches exactly five digits. I'm assuming this is your lookahead; you can replace it with another pattern.

You can do so with help of regex lookahead.
.*(?=12345)
Example:
var data = "This is is a test.... 12345";
var rxStr = ".*(?=12345)";
var rx = new System.Text.RegularExpressions.Regex(rxStr,
    System.Text.RegularExpressions.RegexOptions.IgnoreCase);
var match = rx.Match(data);
if (match.Success) {
    Console.WriteLine(match.Value);
}
The code snippet above will print everything up to 12345:
This is is a test....
For more detail, see regex positive lookahead.
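The lazy and greedy lookahead patterns give the same result on the sample string, but they diverge when the terminator occurs more than once. A minimal sketch with a made-up second input; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Lazy: stops at the first position followed by five digits.
    public static string BeforeFirst(string s) =>
        Regex.Match(s, @".*?(?=\d{5})").Value;

    // Greedy: backtracks from the end, so it stops at the last "12345".
    public static string BeforeLast(string s) =>
        Regex.Match(s, @".*(?=12345)").Value;

    public static void Main()
    {
        string once = "This is is a test.... 12345";
        Console.WriteLine("[" + BeforeFirst(once) + "]"); // [This is is a test.... ]
        Console.WriteLine("[" + BeforeLast(once) + "]");  // [This is is a test.... ]

        string twice = "a 12345 b 12345"; // made-up input for illustration
        Console.WriteLine("[" + BeforeFirst(twice) + "]"); // [a ]
        Console.WriteLine("[" + BeforeLast(twice) + "]");  // [a 12345 b ]
    }
}
```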

This should get you started:
var reg = new Regex("^(.+)12345$");
var match = reg.Match("This is is a test.... 12345");
var group = match.Groups[1]; // This is is a test....
Of course you'd want to do some additional validation, but this is the basic idea.

^ means start of string
$ means end of string
The asterisk tells the engine to attempt to match the preceding token zero or more times. The plus tells the engine to attempt to match the preceding token once or more
{min,max} indicate the minimum/maximum number of matches.
\d matches a single character that is a digit, \w matches a "word character" (alphanumeric characters plus underscore), and \s matches a whitespace character (includes tabs and line breaks).
[^a] is a negated character class: it matches any single character except a
The dot matches a single character, except line break characters
In your case there are many ways to accomplish the task.
E.g. matching everything up to the first digit: ^[^\d]*
If you know the exact set of characters and they are not just digits, don't use a regex; use IndexOf(). If you know the separator between the first and second parts (for example "..."), you can use Split().
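As a small illustration of the ^[^\d]* suggestion, here is a minimal sketch; the helper name LeadingNonDigits is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NonDigitPrefixDemo
{
    // Returns the run of non-digit characters at the start of the string.
    public static string LeadingNonDigits(string s) =>
        Regex.Match(s, @"^[^\d]*").Value;

    public static void Main()
    {
        Console.WriteLine("[" + LeadingNonDigits("This is is a test.... 12345") + "]");
        // [This is is a test.... ]
        Console.WriteLine("[" + LeadingNonDigits("12345") + "]"); // []
    }
}
```

Because the quantifier is *, the match always succeeds; it is simply empty when the string starts with a digit.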

Take a look at this snippet:
class Program
{
    static void Main(string[] args)
    {
        string input = "This is is a test.... 12345";
        // Here we call Regex.Matches with two named groups.
        MatchCollection matches = Regex.Matches(input, @"(?<MySentence>(\w+\s*)*)(?<MyNumberPart>\d*)");
        foreach (Match item in matches)
        {
            Console.WriteLine(item.Groups["MySentence"]);
            Console.WriteLine("******");
            Console.WriteLine(item.Groups["MyNumberPart"]);
        }
        Console.ReadKey();
    }
}

You could just split, though this is not as efficient as the IndexOf solution:
string value = "oiasjdoiasj12345";
string end = "12345";
// Take the first part of the result; not the quickest, but fairly simple.
string result = value.Split(new string[] { end }, StringSplitOptions.None)[0];

Extract string that contains only letters in C#

string input = "5991 Duncan Road";
var onlyLetters = new String(input.Where(Char.IsLetter).ToArray());
Output: DuncanRoad
But the output I expect is Duncan Road. What do I need to change?

For the input like yours, you do not need a regex, just skip all non-letter symbols at the beginning with SkipWhile():
Bypasses elements in a sequence as long as a specified condition is true and then returns the remaining elements.
C# code:
var input = "5991 Duncan Road";
var onlyLetters = new String(input.SkipWhile(p => !Char.IsLetter(p)).ToArray());
Console.WriteLine(onlyLetters);
See IDEONE demo
A regex solution that will remove numbers that are not part of words, along with the adjoining whitespace:
var res = Regex.Replace(input, @"\s+(?<!\p{L})\d+(?!\p{L})|(?<!\p{L})\d+(?!\p{L})\s+", string.Empty);

You can use this lookaround based regex:
string repl = Regex.Replace(input, @"(?<![a-zA-Z])[^a-zA-Z]|[^a-zA-Z](?![a-zA-Z])", "");
//=> Duncan Road
(?<![a-zA-Z])[^a-zA-Z] matches a non-letter that is not preceded by another letter.
| is regex alternation
[^a-zA-Z](?![a-zA-Z]) matches a non-letter that is not followed by another letter.
RegEx Demo

You can still use LINQ filtering with Char.IsLetter || Char.IsWhiteSpace. To remove all leading and trailing whitespace chars you can call String.Trim:
string input = "5991 Duncan Road";
string res = String.Join("", input.Where(c => Char.IsLetter(c) || Char.IsWhiteSpace(c)))
.Trim();
Console.WriteLine(res); // Duncan Road
