C# Why can't I split the string? - c#

string myNumber = "3.44";
Regex regex1 = new Regex(".");
string[] substrings = regex1.Split(myNumber);
foreach (var substring in substrings)
{
Console.WriteLine("The string is : {0} and the length is {1}",substring, substring.Length);
}
Console.ReadLine();
I tried to split the string by ".", but the split returns only empty strings. Why?

. means "any character" in regular expressions. So don't split using a regex - split using String.Split:
string[] substrings = myNumber.Split('.');
If you really want to use a regex, you could use:
Regex regex1 = new Regex(@"\.");
The @ makes it a verbatim string literal, to stop you from having to escape the backslash. The backslash within the string itself is an escape for the dot within the regex parser.
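As a minimal sketch of the difference between the two approaches (the class and method names here are illustrative, not from the question), the following compares the escaped regex with the plain String.Split call:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Splits on a literal dot; the verbatim string keeps the backslash for the regex parser.
    public static string[] SplitWithRegex(string input) => Regex.Split(input, @"\.");

    // Splits on a literal dot without involving the regex engine at all.
    public static string[] SplitWithString(string input) => input.Split('.');

    public static void Main()
    {
        string myNumber = "3.44";
        Console.WriteLine(string.Join("|", SplitWithRegex(myNumber)));   // 3|44
        Console.WriteLine(string.Join("|", SplitWithString(myNumber)));  // 3|44
    }
}
```

Both calls give the same two pieces here; String.Split is usually the simpler choice when the delimiter is a fixed character.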

The easiest solution would be: string[] val = myNumber.Split('.');

. is a reserved character in regex. If you literally want to match a period, try:
Regex regex1 = new Regex(@"\.");
However, you're better off simply using myNumber.Split('.');

The dot matches a single character, without caring what that character
is. The only exception are newline characters.
Source: http://www.regular-expressions.info/dot.html
Therefore your code is asking to split the string at every character.
Use this instead:
string[] substrings = myNumber.Split('.');

Keep it simple and use the String.Split() method:
string[] substrings = myNumber.Split('.');
It has another overload which allows specifying split options:
public string[] Split(
char[] separator,
StringSplitOptions options
)
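To illustrate what the options overload changes, here is a small sketch (the SplitNonEmpty helper name and the sample input are assumptions for the example) that uses StringSplitOptions.RemoveEmptyEntries to drop empty pieces:

```csharp
using System;

public static class SplitOptionsDemo
{
    // Splits on '.' and discards empty entries produced by adjacent or trailing dots.
    public static string[] SplitNonEmpty(string input) =>
        input.Split(new[] { '.' }, StringSplitOptions.RemoveEmptyEntries);

    public static void Main()
    {
        // "1..2." has an empty piece between the two dots and one after the last dot.
        Console.WriteLine(string.Join("|", "1..2.".Split('.')));      // 1||2|
        Console.WriteLine(string.Join("|", SplitNonEmpty("1..2.")));  // 1|2
    }
}
```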

You don't need a regex; you can do that with the Split method of the string object:
string myNumber = "3.44";
string[] substrings = myNumber.Split('.');
foreach (var substring in substrings)
{
Console.WriteLine("The string is : {0} and the length is {1}",substring, substring.Length);
}
Console.ReadLine();

The period "." is being interpreted as any single character instead of a literal period.
Instead of using regular expressions you could just do:
string[] substrings = myNumber.Split('.');

In Regex patterns, the period character matches any single character. If you want the Regex to match the actual period character, you must escape it in the pattern, like so:
@"\."
Now, this case is somewhat simple for Regex matching; you could instead use String.Split() which will split based on the occurrence of one or more static strings or characters:
string[] substrings = myNumber.Split('.');

Try
Regex regex1 = new Regex(@"\.");
EDIT: Er... I guess under a minute after Jon Skeet is not too bad, anyway...

You'll want to place an escape character before the "." - like this "\\."
"." in a regex matches any character, so if you pass 4 characters to a regex with only ".", every character is treated as a delimiter and the split returns five empty strings. Check out this page for common operators.

Try
Regex regex1 = new Regex("[.]");
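When the delimiter comes from a variable rather than a literal, Regex.Escape can produce the escaped pattern for you. A minimal sketch, assuming a hypothetical SplitOnLiteral helper:

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Escapes regex metacharacters in the delimiter so it is matched literally.
    public static string[] SplitOnLiteral(string input, string delimiter) =>
        Regex.Split(input, Regex.Escape(delimiter));

    public static void Main()
    {
        Console.WriteLine(Regex.Escape("."));                              // \.
        Console.WriteLine(string.Join("|", SplitOnLiteral("3.44", ".")));  // 3|44
    }
}
```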

Related

Removing special characters using Regex in C#

I have one problem in this code. I want to remove all special characters but the square brackets are not getting removed.
string regExp = "[\\\"]";
string tmp = Regex.Replace(str, regExp," ");
string[] strArray = tmp.Split(',');
obj.amcid = db.Execute("select MAX(amcid)+1 from sca_amcmaster");
foreach (string i in strArray)
{
// int myInts = int.Parse(i);
db.Execute(";EXEC insertitems1 @0,@1", i, obj.invoiceno);
}
Square brackets are metacharacters in regular expressions: they delimit a character class, a list of characters to match. Your pattern [\\\"] is a class that contains only the backslash and the double quote. To match square brackets too, add them to the class, escaped with a backslash:
string regExp = @"[\[\]\\""]";
The backslashes before the square brackets make the regex treat them as literal characters.
If you instead want to match the four characters in sequence, each of them optional, you can group them using parentheses and the ? quantifier (zero or one match):
string regExp = @"(\[)?(\\)?("")?(\])?";
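A short sketch of a character class that covers square brackets, backslashes and double quotes (the RemoveSpecial name and the sample input are assumptions for illustration; the replacement with a space follows the question's code):

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpecialCharDemo
{
    // Character class containing [, ], \ and "; each occurrence is replaced by a space.
    public static string RemoveSpecial(string input) =>
        Regex.Replace(input, @"[\[\]\\""]", " ");

    public static void Main()
    {
        string cleaned = RemoveSpecial("[\"1\",\"2\"]");
        // Split on commas and trim the padding left behind by the replacement.
        foreach (string part in cleaned.Split(','))
        {
            Console.WriteLine(part.Trim());
        }
    }
}
```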

Remove numbers in specific part of string (within parentheses)

I have a string Test123(45) and I want to remove the numbers within the parenthesis. How would I go about doing that?
So far I have tried the following:
string str = "Test123(45)";
string result = Regex.Replace(str, "(\\d)", string.Empty);
This however leads to the result Test(), when it should be Test123().
This replaces every pair of parentheses that contains only digits with an empty pair of parentheses:
string str = "Test123(45)";
string result = Regex.Replace(str, @"\(\d+\)", "()");
\d+(?=[^(]*\))
Try this. Use it with a verbatim string (@). The lookahead makes sure the number is followed by a ) with no ( before it. Replace with an empty string.
See demo.
https://regex101.com/r/uE3cC4/4
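A minimal sketch of the lookahead pattern applied through Regex.Replace (the RemoveDigitsInParens name is made up for the example). Note that the pattern only checks for a closing parenthesis ahead, so digits followed by a stray ) with no opening ( would be removed as well:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Removes digit runs that are followed by ")" with no "(" in between.
    public static string RemoveDigitsInParens(string input) =>
        Regex.Replace(input, @"\d+(?=[^(]*\))", string.Empty);

    public static void Main()
    {
        Console.WriteLine(RemoveDigitsInParens("Test123(45)"));  // Test123()
    }
}
```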
string str = "Test123(45)";
string result = Regex.Replace(str, @"\(\d+\)", "()");
you can also try this way:
string str = "Test123(45)";
string[] delimiters = { "(" };
string[] split = str.Split(delimiters, StringSplitOptions.None);
var b=split[0]+"()";
To remove numbers that are inside parentheses, while keeping the parentheses and any non-digit text inside them, match every parenthesized substring with \([^()]+\) and remove the digits inside a MatchEvaluator passed to Regex.Replace.
Here is a C# sample program:
var str = "Test123(45) and More (5 numbers inside parentheses 123)";
var result = Regex.Replace(str, @"\([^()]+\)", m => Regex.Replace(m.Value, @"\d+", string.Empty));
// => Test123() and More ( numbers inside parentheses )
To remove digits that are enclosed in ( and ) symbols, ASh's \(\d+\) solution will work well: \( matches a literal (, \d+ matches 1+ digits, \) matches a literal ).

Get a special substring in C#

I need to extract a substring from an existing string. The string starts with uninteresting characters (including commas, spaces and numbers) and ends with ", 123," or ", 57," or something similar, where the numbers can change. I only need the numbers.
Thanks
public static void Main(string[] args)
{
string input = "This is 2 much junk, 123,";
var match = Regex.Match(input, @"(\d+),$"); // Ends with at least one digit
// followed by comma,
// grab the digits.
if(match.Success)
Console.WriteLine(match.Groups[1]); // Prints '123'
}
Regex to match numbers: Regex regex = new Regex(@"\d+");
Source (slightly modified): Regex for numbers only
I think this is what you're looking for:
Remove all non numeric characters from a string using Regex
using System.Text.RegularExpressions;
...
string newString = Regex.Replace(oldString, "[^.0-9]", "");
(If you don't want to allow the decimal delimiter in the final result, remove the . from the regular expression above).
Try something like this :
String numbers = new String(yourString.TakeWhile(x => char.IsNumber(x)).ToArray());
You can use \d+ to match all digits within a given string.
So your code would be:
var lst = Regex.Matches(inp, @"\d+")
.Cast<Match>()
.Select(x => x.Value);
lst now contains all the numbers.
But if your input always looks like the one in your question, you don't need a regex:
input.Substring(input.LastIndexOf(", ") + 2, input.LastIndexOf(",") - input.LastIndexOf(", ") - 2);
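Putting the anchored-regex idea into a reusable form, here is a hedged sketch (the TrailingNumber helper and its nullable return type are assumptions, not part of any answer) that returns the number before the final comma, or null when the input does not end that way:

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrailingNumberDemo
{
    // Returns the digits directly before the final comma, or null if there are none.
    public static int? TrailingNumber(string input)
    {
        Match match = Regex.Match(input, @"(\d+),\s*$");
        if (match.Success && int.TryParse(match.Groups[1].Value, out int value))
        {
            return value;
        }
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(TrailingNumber("This is 2 much junk, 123,"));  // 123
    }
}
```

Using int.TryParse keeps the helper from throwing when the digit run is too long to fit in an int.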

Trim string by strings

How can I trim a string by a whole string instead of a list of single characters?
I want to remove all &nbsp; and whitespaces at the beginning and end of an HTML string. But String.Trim() only has overloads for a set of characters, not for a set of strings.
You could use HttpUtility.HtmlDecode(String) and use the resultant as an input for String.Trim()
HttpUtility.HtmlDecode on MSDN
HttpServerUtility.HtmlDecode on MSDN (a wrapper you can access through the Page.Server property)
string stringWithNonBreakingSpaces;
string trimmedString = HttpUtility.HtmlDecode(stringWithNonBreakingSpaces).Trim();
Note: This solution would decode all the HTML strings in the input.
The Trim method removes from the current string all leading and trailing white-space characters by default.
Edit: Solution for your problem AFTER your edit:
string input = @"&nbsp; <a href='#'>link</a> &nbsp;";
Regex regex = new Regex(@"^(&nbsp;|\s)*|(&nbsp;|\s)*$");
string result = regex.Replace(input, String.Empty);
This will remove all trailing and leading spaces and &nbsp;. You can add any string or character group to the expression. If you were to trim all tabs too the regex would simply become:
Regex regex = new Regex(@"^(&nbsp;|\s|\t)*|(&nbsp;|\s|\t)*$");
Not sure if this is what you're looking for?
string str = "hello&nbsp;";
str = str.Replace("&nbsp;", "");
str = str.Trim();
Use RegEx, as David Heffernan said. It is rather easy to select all spaces and &nbsp; at the start of a string: ^(\ |&nbsp;)*
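The regex approach can be generalized to any set of strings. The following is only a sketch under stated assumptions: TrimStrings is a hypothetical helper, it expects at least one token, and it also trims ordinary whitespace:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TrimStringsDemo
{
    // Removes leading and trailing runs made up of the given tokens and whitespace.
    // Assumes at least one token is supplied.
    public static string TrimStrings(string input, params string[] tokens)
    {
        // Escape each token so regex metacharacters in it are matched literally.
        string alternation = string.Join("|", tokens.Select(Regex.Escape));
        string pattern = $@"^(?:{alternation}|\s)+|(?:{alternation}|\s)+$";
        return Regex.Replace(input, pattern, string.Empty);
    }

    public static void Main()
    {
        string input = "&nbsp; <a href='#'>link</a> &nbsp;";
        Console.WriteLine(TrimStrings(input, "&nbsp;"));  // <a href='#'>link</a>
    }
}
```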

Removing numbers from text using C#

I have a text file for processing, which has some numbers. I want JUST text in it, and nothing else. I managed to remove the punctuation marks, but how do I remove the numbers? I want this using C# code.
Also, I want to remove words with length greater than 10. How do I do that using Reg Expressions?
You can do this with a regex:
string withNumbers = // string with numbers
string withoutNumbers = Regex.Replace(withNumbers, "[0-9]", "");
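Use this regex to remove words with more than 10 characters:
\b\w{11,}\b
The quantifier {11,} sets only a minimum length (11 or more word characters); a maximum can be added as {11,100}. Do not put a space inside the braces, or the quantifier is treated as literal text.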
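A brief sketch of a minimum-length quantifier used with Regex.Replace (the RemoveLongWords name is an assumption; the pattern \b\w{11,}\b matches whole words of 11 or more word characters). The whitespace around a removed word is left in place:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LongWordDemo
{
    // Deletes every whole word that has 11 or more word characters.
    public static string RemoveLongWords(string input) =>
        Regex.Replace(input, @"\b\w{11,}\b", string.Empty);

    public static void Main()
    {
        // "tenletters" has exactly 10 characters and is kept.
        Console.WriteLine(RemoveLongWords("tenletters extraordinarily"));
    }
}
```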
Only letters and nothing else (because I see you also want to remove the punctuation marks)
Regex.IsMatch(input, @"^[a-zA-Z]+$");
You can also use string.Join:
string s = "asdasdad34534t3sdf43534";
s = string.Join(null, System.Text.RegularExpressions.Regex.Split(s, "[\\d]"));
The Regex.Replace method should do the trick.
// regex to match any digit
var regex = new Regex(@"\d");
// replace all matches in input with empty string
var output = regex.Replace(input, String.Empty);
