Regular expression for parsing ::number::sentence:: [closed] - c#

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
What would a regex for validating values of the form ::number::sentence:: look like? For example:
::1::some text::
::2::some text's::
::234::some's text's::

You could use String.Split and avoid a regex completely if your string is as simple as this e.g.
var data = "::234::some's text's::".Split(new string[] { "::" }, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(data[0]); // 234
Console.WriteLine(data[1]); // some's text's
If you need to use it for validation you can still use the same logic as above e.g.
public bool Validate(string str)
{
    var data = str.Split(new string[] { "::" }, StringSplitOptions.RemoveEmptyEntries);
    double n;
    return data.Length == 2 && Double.TryParse(data[0], out n) && !String.IsNullOrWhiteSpace(data[1]);
}
...
bool valid = Validate("::234::some's text's::");

Something like:
^::([0-9]+)::((?:(?!::).)*)::$
Example code:
Match match = Regex.Match("::1::some text::", "^::([0-9]+)::((?:(?!::).)*)::$");
var groups = match.Groups;
string num = groups[1].ToString(); // "1"
string text = groups[2].ToString(); // "some text"
explanation:
^ Begin of the string
:: 2x ':'
([0-9]+) Match group 1, the 0-9 digits, one or more
:: 2x ':'
((?:(?!::).)*) Match group 2, any character that does not start a :: sequence, zero or more
:: 2x ':'
$ End of the string
The ((?:(?!::).)*) requires a little more explanation... Let's peel it...
( ... ) the first '(' and last ')', match group 2
So now we have:
(?:(?!::).)*
so
(?: ... )* group without a name (non-capturing group) repeated 0 or more times. Its content will be put in match group 2 because it is defined inside match group 2
composed of:
(?!::).
where
. is any character
BUT before consuming that character, the zero-width negative lookahead (?!::) checks that the character and the one after it are not ::. The lookahead consumes nothing, so the . then matches the character itself.
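The anchored pattern explained above can be wrapped in a small helper to see which inputs it accepts. This is only a sketch: the names TaggedValue and TryParse are made up for illustration and do not come from the answers.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answers.
static class TaggedValue
{
    // Same anchored pattern as in the answer above.
    private static readonly Regex Pattern =
        new Regex(@"^::([0-9]+)::((?:(?!::).)*)::$");

    // Returns true and fills number/text when input has the form ::number::sentence::.
    public static bool TryParse(string input, out string number, out string text)
    {
        Match m = Pattern.Match(input);
        number = m.Success ? m.Groups[1].Value : null;
        text = m.Success ? m.Groups[2].Value : null;
        return m.Success;
    }
}

class Program
{
    static void Main()
    {
        // One valid value, one with a non-numeric first field, one with "::" inside the text.
        foreach (string s in new[] { "::234::some's text's::", "::x::text::", "::1::a::b::" })
        {
            bool ok = TaggedValue.TryParse(s, out string number, out string text);
            Console.WriteLine(ok ? $"{s} -> {number} | {text}" : $"{s} -> no match");
        }
    }
}
```

Only the first sample should match. The other two fail because of the [0-9]+ group and the (?!::) check respectively. Note that in .NET $ also matches just before a final newline, so \z may be the safer anchor if trailing newlines must be rejected.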

Related

Splitting a string into characters, but keeping some together [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
I have this string: TF'E'
I want to split it into characters, but each ' character should be joined to the character before it.
So it would look like this: T, F' and E'
You could use a regular expression to split the string at each position immediately before a new letter and an optional ':
var input = "TF'E'";
var output = Regex.Split(input, @"(?<!^)(?=\p{L}'?)");
output will now be a string array like ["T", "F'", "E'"]. The lookbehind (?<!^) ensures we never split at the start of the string, whereas the lookahead (?=\p{L}'?) describes one letter \p{L} followed by 0 or 1 '.
You can use a regex to capture "an uppercase character followed optionally by an apostrophe"
var mc = Regex.Matches(input, "(?<x>[A-Z]'?)");
foreach(Match m in mc)
Console.WriteLine(m.Groups["x"].Value);
If you don't like regex, you can use this method:
public static IEnumerable<string> Split(string input)
{
    for(int i = 0; i < input.Length; i++)
    {
        if(i != (input.Length - 1) && input[i+1] == '\'')
        {
            yield return input[i].ToString() + input[i+1].ToString();
            i++;
        }
        else
        {
            yield return input[i].ToString();
        }
    }
}
We loop through the input string. We check if there is a next character and if it is a '. If true, return the current character and the next character and increase the index by one. If false, just return the current character.
Online demo: https://dotnetfiddle.net/sPCftB

Lambda Expression : Pick out a substring from a larger string [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
This is the string I am trying to process:
var odataQuery =
"$filter=HRRepName ne null and HRRepName ne '' and HRRepName eq 'jessica.l.hessling'&$top=1";
I currently use the code below to get the substring jessica.l.hessling:
var repName = odataQuery
.Split(new string[] { "eq" }, StringSplitOptions.RemoveEmptyEntries)[1]
.Split(new char[] { (char)39 })[1]
.Replace("'", "")
.Trim();
But indexing into the split result like this might cause a bug later, so I want to use a lambda expression instead.
What I have tried till now :
var repName2 = odataQuery
.Split(new string[] { "HRRepName" }, StringSplitOptions.RemoveEmptyEntries)
.Select(s => s.Substring(s.IndexOf("eq", StringComparison.Ordinal) + 1));
I think a regex is a good choice here. Try the code below:
var str = "$filter=HRRepName ne null and HRRepName ne '' and HRRepName eq 'jessica.l.hessling'&$top=1";
var match = (new Regex(@"HRRepName eq '([^']+)")).Match(str);
var extractedString = match.Success ? match.Groups[1].Value : null;
Explanation: HRRepName eq '([^']+) matches HRRepName eq ' literally, then ([^']+) matches everything up to the next ' character. The parentheses mean that this text is stored in a capture group.
You wrote:
this can be any name , i want the string right after eq but before '&'
To find whether items are in a string, and/or extract substrings from a string according to some pattern, RegEx is usually the way to go.
To fetch the data after the first eq and before the first & after this eq:
const string regexPattern = ".*?eq(.*?)&";
var match = Regex.Match(str, regexPattern);
if (match.Success)
{   // found the pattern; the captures are in match.Groups
    ProcessMatch(match.Groups[1]); // [0] is the complete matching string, [1] is the first capture
}
The pattern:
.*? start the string with zero or more characters, as few as possible
eq until the first occurrence of eq
(.*?) capture zero or more characters, as few as possible
& until the first & after this eq
You can test this using one of the online RegEx pattern testers
The captured item is in match.Groups, a GroupCollection in which element [0] is the complete matching string and [1] is the first captured item. Your debugger will show this to you.
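A capture taken between eq and & still contains the leading space and the surrounding quotes, so it needs a little cleanup. The sketch below assumes that eq appears as a separate word and that the value is followed by &. The names ODataFilter and ValueAfterEq are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answers.
static class ODataFilter
{
    // Returns the text between the first whole-word "eq" and the next '&',
    // with surrounding whitespace and single quotes removed, or null if there is no match.
    public static string ValueAfterEq(string query)
    {
        Match m = Regex.Match(query, @"\beq\b(.*?)&");
        return m.Success ? m.Groups[1].Value.Trim().Trim('\'') : null;
    }
}

class Program
{
    static void Main()
    {
        string query = "$filter=HRRepName ne null and HRRepName ne '' and HRRepName eq 'jessica.l.hessling'&$top=1";
        Console.WriteLine(ODataFilter.ValueAfterEq(query)); // jessica.l.hessling
    }
}
```

If the eq clause can be the last part of the query, with no & after it, this pattern finds nothing and the helper returns null. That case would need an alternative such as (?:&|$) at the end of the pattern.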

How to remove sequence of characters from string [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I want to remove an unknown number of character sequences B from a given string A.
The removing must start to the right of the position of a character sequence C. The removing must stop when the B character sequence ends.
Example for string A:
xxxxxxxxBxxxxxxxxxxxxxxxxxCBBBBBByyyyyyyyyByyyy
A ... sequence of characters from which B's that follow C must be removed
C ... a sequence of characters (example: 123)
B ... a sequence of characters (example: vbz)
x and y ... any characters
In this example all B's after C must be removed. All other B's must not be removed.
The result would be:
xxxxxxxxBxxxxxxxxxxxxxxxxxCyyyyyyyyyByyyy
I tried to use:
A = A.Replace("vbz", "");
but that removes every 'vbz' sequence from A.
How can I exclude from removal those 'vbz' sequences that are not preceded by C?
Regards, Manu
Why don't you try this?
var.Replace("x", "");
var.Replace("y", "");
Just replace x and y with the unknown string sequence
string A = "xxxxxxxxBxxxxxxxxxxxxxxxxxCBBBBBByyyyyyyyyByyyy";
string pattern = @"(?<=C)[B]*";
string B = Regex.Replace(A, pattern, "");
As per your requirement, two conditions need to be satisfied when removing from the string:
1. An unknown number of B sequences must be removed
2. The removal must start to the right of the position of string C
This can be achieved using the Regex class of the System.Text.RegularExpressions namespace.
string A = "xxxxxxxxBxxxxxxxxxxxxxxxxxCBBBBBByyyyyyyyyByyyy";
string pattern = "(?<=C)[b]*";
string result = Regex.Replace(A, pattern,"",RegexOptions.IgnoreCase);
Note:
The pattern variable contains the regex pattern (?<=C)[b]*:
(?<=C) : lookbehind; matches any position that follows the prefix "C"
[b] : the character to be removed
* : zero or more repetitions of the preceding element
RegexOptions.IgnoreCase : makes the match case-insensitive, so both B and b are removed (and the prefix may be C or c)
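The patterns above treat B and C as single characters. For multi-character sequences such as 123 and vbz from the question, the same lookbehind idea can be sketched with a non-capturing group and Regex.Escape. The names SequenceRemover and RemoveAfter are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answers.
static class SequenceRemover
{
    // Removes every run of one or more b sequences that directly follows a c sequence.
    // Regex.Escape keeps characters such as '.' or '*' in b and c literal.
    public static string RemoveAfter(string a, string c, string b)
    {
        string pattern = "(?<=" + Regex.Escape(c) + ")(?:" + Regex.Escape(b) + ")+";
        return Regex.Replace(a, pattern, "");
    }
}

class Program
{
    static void Main()
    {
        // The "vbz" before "123" and the one between the y's stay; the run right after "123" goes.
        Console.WriteLine(SequenceRemover.RemoveAfter("xxvbzxx123vbzvbzvbzyyvbzyy", "123", "vbz"));
        // prints xxvbzxx123yyvbzyy
    }
}
```

If C occurs more than once in A, this removes the B run after every occurrence, which may or may not be what is wanted.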

For loop check if string only has 3 capital letters followed by 4 numbers [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
My input string has to start with three capital letters and end with four digits (like so: "SJL1036"). The program is just supposed to check whether the input follows that pattern.
If I were to input "Sjl1036" or "SJL103", it should report the input as invalid.
Try this regular expression. 3 uppercase, 4 numbers.
^[A-Z]{3}[0-9]{4}$
For example:
var value = "FSK2526";
if (Regex.IsMatch(value, @"^[A-Z]{3}[0-9]{4}$")) {
// it matches
}
You could do it with a for loop, but a regex is simpler:
Regex regex = new Regex(@"^[A-Z]{3}.*[0-9]{4}$");
Match match = regex.Match("SJL1036");
if (match.Success)
{
Console.WriteLine(match.Value);
}
If this is the requirement:
A string that has to start with three capital letters and ending with
four digits
Probably the most efficient approach is using string methods:
bool valid = input.Length >= 7
&& input.Remove(3).All(Char.IsUpper) // or input.Substring(0, 3)
&& input.Substring(input.Length - 4).All(Char.IsDigit);
If the actual requirement is "3 capital letters followed by 4 numbers" (so exactly 7 characters), you just need to change input.Length >= 7 to input.Length == 7.
As a non-regex option, you can use a bit of LINQ:
string str = "SJL1036";
if (str.Length == 7 &&
str.Take(3).All(char.IsUpper)
&& str.Skip(3).All(char.IsDigit))
{
Console.WriteLine("valid");
}
else
{
Console.WriteLine("invalid");
}
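Since the question asks about a for loop, here is a minimal sketch of that variant, assuming the strict "exactly 3 capital letters followed by 4 digits" reading. It compares against the ASCII ranges A-Z and 0-9 directly, whereas char.IsUpper and char.IsDigit also accept non-ASCII letters and digits. The names PlateCheck and IsValid are hypothetical.

```csharp
using System;

// Hypothetical helper, not part of the original answers.
static class PlateCheck
{
    // True only for exactly three ASCII capital letters followed by four ASCII digits.
    public static bool IsValid(string input)
    {
        if (input == null || input.Length != 7) return false;
        for (int i = 0; i < input.Length; i++)
        {
            char ch = input[i];
            // Positions 0-2 must be letters, positions 3-6 must be digits.
            bool ok = i < 3 ? (ch >= 'A' && ch <= 'Z') : (ch >= '0' && ch <= '9');
            if (!ok) return false;
        }
        return true;
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(PlateCheck.IsValid("SJL1036")); // True
        Console.WriteLine(PlateCheck.IsValid("Sjl1036")); // False
        Console.WriteLine(PlateCheck.IsValid("SJL103"));  // False
    }
}
```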

How to remove extra hyphens from string in c#? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I have a string in which spaces have been replaced by hyphens ('-'). If there are multiple consecutive hyphens, I want to remove all but one of them. Only hyphens must be removed, not consecutive digits.
Eg: --11- must be -11- and not -1-
Eg: --12- o/p: -12-
Eg: -12-- o/p: -12-
I would like to do this using LINQ or a string function in C#.
I have tried it using str = str.Remove(str.Length - 1);, but it only removes one character.
If you just want to collapse multiple consecutive - characters into one, you could easily do this using regex:
string output = Regex.Replace(input, @"\-+", "-");
try
string sample = "--12";
string Reqdoutput = sample.Replace("--", "-");
If you want to replace just the hyphen, you can do one of the things given in the other answers. For removing all double characters, you can do this:
String input = "------hello-----";
int i = 1;
while (i < input.Length)
{
if (input[i] == input[i - 1])
{
input = input.Remove(i, 1);
}
else
{
i++;
}
}
Console.WriteLine(input); // Will give "-helo-"
Why not just do :
yourString = yourString.Replace("--", "-");
Or did I understand the problem wrong ?
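One caveat with a single Replace("--", "-") pass is that it only halves each run, so three or more hyphens in a row are not fully collapsed. The sketch below contrasts it with the regex approach. The names Hyphens and Collapse are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answers.
static class Hyphens
{
    // Collapses every run of one or more hyphens into a single hyphen.
    public static string Collapse(string input)
    {
        return Regex.Replace(input, "-+", "-");
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(Hyphens.Collapse("--11-"));       // -11-
        Console.WriteLine(Hyphens.Collapse("-12--"));       // -12-
        Console.WriteLine("----12".Replace("--", "-"));     // --12 (one pass is not enough)
        Console.WriteLine(Hyphens.Collapse("----12"));      // -12
    }
}
```

Without a regex, repeating the Replace call while the string still contains "--" gives the same result.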
