Lambda Expression : Pick out a substring from a larger string [closed] - c#

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
This is the string I am trying to process:
var odataQuery =
"$filter=HRRepName ne null and HRRepName ne '' and HRRepName eq 'jessica.l.hessling'&$top=1";
I currently use the code below to extract the substring jessica.l.hessling:
var repName = odataQuery
.Split(new string[] { "eq" }, StringSplitOptions.RemoveEmptyEntries)[1]
.Split(new char[] { (char)39 })[1]
.Replace("'", "")
.Trim();
However, the hard-coded indexes may cause bugs later, so I would like to use a lambda expression instead.
What I have tried so far:
var repName2 = odataQuery
.Split(new string[] { "HRRepName" }, StringSplitOptions.RemoveEmptyEntries)
.Select(s.Substring(s.IndexOf("eq",StringComparison.Ordinal)+1));

I think Regex is a very good choice here. Try the code below:
var str = "$filter=HRRepName ne null and HRRepName ne '' and HRRepName eq 'jessica.l.hessling'&$top=1";
var match = new Regex(@"HRRepName eq '([^']+)").Match(str);
var extractedString = match.Success ? match.Groups[1].Value : null;
Explanation: HRRepName eq '([^']+) matches HRRepName eq ' literally, then ([^']+) matches everything up to the next ' character. The parentheses mean that this part is stored in a capture group.
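The answer above can be wrapped in a small helper that returns the captured name as a plain string. This is only a sketch: the names RepNameExtractor and ExtractRepName are made up for illustration, and the pattern assumes the value is always enclosed in single quotes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RepNameExtractor
{
    // Hypothetical helper: returns the quoted value after "HRRepName eq", or null if absent.
    public static string ExtractRepName(string odataQuery)
    {
        // ([^']+) captures one or more characters that are not a single quote.
        Match match = Regex.Match(odataQuery, @"HRRepName eq '([^']+)'");
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        var str = "$filter=HRRepName ne null and HRRepName ne '' and HRRepName eq 'jessica.l.hessling'&$top=1";
        Console.WriteLine(ExtractRepName(str)); // jessica.l.hessling
    }
}
```

Returning null when nothing matches keeps the caller from indexing into an array that may be shorter than expected, which was the concern with the Split-based version.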

You wrote:
this can be any name , i want the string right after eq but before '&'
To check whether a string contains certain items, and/or to extract substrings according to some pattern, Regex is usually the way to go.
To fetch the data after the first eq and before the first & after this eq:
const string regexPattern = "eq(.*?)&";
var match = Regex.Match(str, regexPattern);
if (match.Success)
{ // found the pattern, the captures are in match.Groups
ProcessMatch(match.Groups[1].Value); // [0] is the complete matching string, [1] is the first capture
}
The pattern:
eq find the first occurrence of eq
(.*?) capture as few characters as possible (a lazy quantifier)
& up to the first & after this eq
A greedy pattern such as ".*eq(.*)&" would instead run to the last eq and the last & in the string.
You can test this using one of the online regex pattern testers.
The captured item is in match.Groups, a GroupCollection in which element [0] is the complete matching string and [1] is the first captured group. Note that the capture includes the space and the quotes around the name, so trim them if needed.
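As a rough illustration of "after the first eq, before the next &", here is a sketch that uses a lazy quantifier and then trims the surrounding space and quotes. The names EqValueDemo and ValueAfterEq are hypothetical, and the approach assumes the letters eq do not occur earlier in the query (for example inside a field name).

```csharp
using System;
using System.Text.RegularExpressions;

public static class EqValueDemo
{
    // Hypothetical helper: text between the first "eq" and the next '&',
    // with surrounding whitespace and single quotes removed.
    public static string ValueAfterEq(string query)
    {
        // (.*?) is lazy, so it stops at the first '&' that follows "eq".
        Match match = Regex.Match(query, "eq(.*?)&");
        return match.Success ? match.Groups[1].Value.Trim().Trim('\'') : null;
    }

    public static void Main()
    {
        var str = "$filter=HRRepName ne null and HRRepName ne '' and HRRepName eq 'jessica.l.hessling'&$top=1";
        Console.WriteLine(ValueAfterEq(str)); // jessica.l.hessling
    }
}
```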

Related

Splitting a string into characters, but keeping some together [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
I have this string: TF'E'
I want to split it into characters, but the ' character should stay joined to the character before it.
So it would look like this: T, F' and E'
You could use a regular expression to split the string at each position immediately before a new letter and an optional ':
var input = "TF'E'";
var output = Regex.Split(input, @"(?<!^)(?=\p{L}'?)");
output will now be a string array like ["T", "F'", "E'"]. The lookbehind (?<!^) ensures we never split at the start of the string, whereas the lookahead (?=\p{L}'?) describes one letter \p{L} followed by 0 or 1 '.
You can use a regex to capture "an uppercase character followed optionally by an apostrophe"
var mc = Regex.Matches(input, "(?<x>[A-Z]'?)");
foreach(Match m in mc)
Console.WriteLine(m.Groups["x"].Value);
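The two regex approaches above can be compared side by side. The sketch below wraps each in a method (MoveSplitter, SplitWithLookarounds and SplitWithMatches are illustrative names, not part of the answers) and assumes the input consists only of uppercase letters and apostrophes.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MoveSplitter
{
    // Split before each letter, except at the very start of the string.
    public static string[] SplitWithLookarounds(string input) =>
        Regex.Split(input, @"(?<!^)(?=\p{L}'?)");

    // Match each uppercase letter together with an optional trailing apostrophe.
    public static string[] SplitWithMatches(string input) =>
        Regex.Matches(input, "[A-Z]'?").Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", SplitWithLookarounds("TF'E'"))); // T, F', E'
        Console.WriteLine(string.Join(", ", SplitWithMatches("TF'E'")));     // T, F', E'
    }
}
```

The matching variant silently skips any character that is not an uppercase letter or an apostrophe following one, while the splitting variant keeps such characters attached to the preceding token.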
If you don't like regex, you can use this method:
public static IEnumerable<string> Split(string input)
{
for(int i = 0; i < input.Length; i++)
{
if(i != (input.Length - 1) && input[i+1] == '\'')
{
yield return input[i].ToString() + input[i+1].ToString();
i++;
}
else
{
yield return input[i].ToString();
}
}
}
We loop through the input string. We check if there is a next character and if it is a '. If true, return the current character and the next character and increase the index by one. If false, just return the current character.
Online demo: https://dotnetfiddle.net/sPCftB

All regex matches inside of one value [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
I pass a string such as var o="ok" into this function, and that works. But when the string contains two or more declarations, the value comes out as something like b"var o="ok.
I have already tried every match method I know, but none of them works, and I can't find anything wrong with the pattern.
public List<Varible> GetVars(string code)
{
List<Varible> vars = new List<Varible>();
Regex dagu = new Regex("var\\s+\\w+=(\\s+|)\"(.+|)\"");
Match reg = dagu.Match(code);
while (reg.Success) {
Match fef = reg;
Varible v = new Varible();
v.vartype = vartype.o_string;
v.name = fef.Value.Substring(fef.Value.IndexOf("r") + 1, fef.Value.IndexOf("=") - fef.Value.IndexOf("r") - 1);
int b = fef.Value.LastIndexOf("\"");
int f = fef.Value.IndexOf("\"");
v.value = fef.Value.Substring(f + 1, b - f - 1);
vars.Add(v);
reg = reg.NextMatch();
}
return vars;
}
There are no errors reported.
About your pattern:
"var\\s+\\w+=(\\s+|)\"(.+|)\""
A): The two capturing groups are not exactly wrong, but they're oddly written. (\\s+|) captures "one or more spaces, OR nothing at all", which can be expressed as "zero or more spaces". In regex, the "zero or more" quantifier is the star, so you can replace this group with (\\s*). Same for (.+|) which becomes (.*).
Then there's the issue with matching this pattern multiple times: the .+ / .* part can, and will, match quotes, so a single match can stretch from the first quote of one declaration to the last quote of another. To avoid this, replace the dot with a negated character class that matches anything except a quote: [^\"].
So now your pattern should look like this:
"var\\s+\\w+=(\\s*)\"([^\"]*)\""
B): Then... You don't use any of this. You seem to only use the match as a whole and redo the pattern by hand to get the part you need. I understand you added the parentheses in your pattern to be able to use the | operator, but they also have the nifty effect of creating capturing groups. A capturing group, for short, is what you're asking your regex engine to specifically look for and point out for you when matching. You can access these groups with the Match.Groups property. In your case, because there are two pairs of parentheses in your pattern, you'll create two groups, the first one contains the spacing between the equal sign and the first quote of your input. You don't seem to need it, so let's remove it, and instead capture the name of your 'var':
"var\\s+(\\w+)=\\s*\"([^\"]*)\""
You can now access the var's name with reg.Groups[1].Value and its value with Groups[2], and before you ask, reg.Groups[0].Value does exist but it always stores the entire match, so for your purposes it's the same as reg.Value.
Now to overhaul all this code:
public List<Varible> GetVars(string code)
{
List<Varible> vars = new List<Varible>();
Regex dagu = new Regex("var\\s+(\\w+)=\\s*\"([^\"]*)\"");
Match reg = dagu.Match(code);
while (reg.Success)
{
Varible v = new Varible();
v.vartype = vartype.o_string;
v.name = reg.Groups[1].Value;
v.value = reg.Groups[2].Value;
vars.Add(v);
reg = reg.NextMatch();
}
return vars;
}
and you should be good.
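To see the effect of the negated character class on an input with two declarations, here is a self-contained sketch. It uses key/value pairs as a stand-in for the Varible class from the question (whose definition is not shown), and the class name VarParser is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class VarParser
{
    // [^"]* cannot cross a quote, so each declaration is matched separately.
    private static readonly Regex VarPattern = new Regex("var\\s+(\\w+)=\\s*\"([^\"]*)\"");

    // Returns (name, value) pairs; a stand-in for the Varible class from the question.
    public static List<KeyValuePair<string, string>> GetVars(string code)
    {
        var vars = new List<KeyValuePair<string, string>>();
        foreach (Match m in VarPattern.Matches(code))
        {
            vars.Add(new KeyValuePair<string, string>(m.Groups[1].Value, m.Groups[2].Value));
        }
        return vars;
    }

    public static void Main()
    {
        foreach (var v in GetVars("var o=\"ok\" var b=\"fine\""))
        {
            Console.WriteLine(v.Key + " = " + v.Value);
        }
        // o = ok
        // b = fine
    }
}
```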

C# Regular Expression Capturing Empty String [duplicate]

This question already has answers here:
C# Regex.Split: Removing empty results
(9 answers)
Closed 5 years ago.
I'm trying to create a simple regular expression in C# to split a string into tokens. The problem I'm running into is that the pattern I'm using captures an empty string, which throws off my expected results. What can I do to change my regular expression so it doesn't capture an empty string?
var input = "ID=123&User=JohnDoe";
var pattern = "(?:id=)|(?:&user=)";
var tokens = Regex.Split(input, pattern, RegexOptions.IgnoreCase);
// Expected Results
// tokens[0] == "123"
// tokens[1] == "JohnDoe"
// Actual Results
// tokens[0] == ""
// tokens[1] == "123"
// tokens[2] == "JohnDoe"
While the comments on your question about using a different approach may have merit, they don't address your specific question about the Regex behavior.
The empty string is not captured by the pattern itself. Regex.Split includes an empty string in the result whenever a match is found at the very beginning (or end) of the input, and here id= matches at index 0, so there is nothing to the left of the first separator.
Edit:
Rewriting the first group as an atomic group, (?>id=)|(?:&user=), does not change where the pattern matches, so don't expect it to remove the leading empty entry.
The simplest fix is to tack a predicate onto the tokens list:
tokens.Where(x => !string.IsNullOrWhiteSpace(x))
I don't think you can solve this problem with Regex.Split to be honest. One brute force way to do this is to remove every "":
var input = "ID=123&User=JohnDoe";
var pattern = "(?:id=)|(?:&user=)";
var tokens = Regex.Split(input, pattern, RegexOptions.IgnoreCase).Where(x => x != "");
I think you should use regex that actually captures the tokens in groups.
var input = "ID=123&User=JohnDoe";
var pattern = "id=(.+)&user=(.+)";
var match = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
var id = match.Groups[1].Value; // 123
var user = match.Groups[2].Value; // JohnDoe
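The two approaches discussed in these answers, filtering the Split result and capturing the values in groups, can be sketched as follows. Regex.Split yields a leading empty string here because the first separator matches at index 0. TokenDemo, SplitTokens and CaptureTokens are illustrative names, and the capturing variant assumes the input has exactly the id=...&user=... shape.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenDemo
{
    // Split on the separators, then drop the empty entry caused by the match at index 0.
    public static string[] SplitTokens(string input) =>
        Regex.Split(input, "id=|&user=", RegexOptions.IgnoreCase)
             .Where(t => t.Length > 0)
             .ToArray();

    // Alternative: capture the values directly instead of splitting.
    public static string[] CaptureTokens(string input)
    {
        Match m = Regex.Match(input, "^id=(.+)&user=(.+)$", RegexOptions.IgnoreCase);
        return m.Success ? new[] { m.Groups[1].Value, m.Groups[2].Value } : new string[0];
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", SplitTokens("ID=123&User=JohnDoe")));   // 123, JohnDoe
        Console.WriteLine(string.Join(", ", CaptureTokens("ID=123&User=JohnDoe"))); // 123, JohnDoe
    }
}
```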

How to find the capital substring of a string? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I am trying to find the capitalized portion of a string, to then insert two characters that represent the Double Capital sign in the Braille language. My intention for doing this is to design a translator that can translate from regular text to Braille.
I'll give an example below.
English String: My variable is of type IEnumerable.
Braille: ,My variable is of type ,,IE-numerable.
I also want the dash in IE-numerable to only break words that have upper and lower case, but not in front of punctuation marks, white spaces, numbers or other symbols.
Thanks a lot in advance for your answers.
I had never heard of a "Double Capital" sign, so I read up on it here. From what I can tell, this should suit your needs.
You can use this to find any sequence of two or more uppercase (majuscule) Latin letters or hyphens in your string:
var matches = Regex.Matches(input, "[A-Z-]{2,}");
You can use this to insert the double-capital sign:
var result = Regex.Replace(input, "[A-Z-]{2,}", ",,$0");
For example:
var input = "this is a TEST";
var result = Regex.Replace(input, "[A-Z-]{2,}", ",,$0"); // this is a ,,TEST
You can use this to handle single and double capitals:
var input = "McGRAW-HILL";
var result = Regex.Replace(input, "[A-Z-]([A-Z-]+)?",
m => (m.Groups[1].Success ? ",," : ",") + m.Value); // ,Mc,,GRAW-HILL
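One thing to watch with [A-Z-] is that it also matches a lone hyphen in lowercase text such as well-known. A possible variation, sketched below under that assumption, requires each run to start and end with an uppercase letter. The names BrailleCapitals and MarkCapitals are made up, and this does not insert the dash the question asks for in IE-numerable.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BrailleCapitals
{
    // A run starts with an uppercase letter; group 1 succeeds only when the run
    // continues (letters or hyphens) and ends with another uppercase letter.
    private static readonly Regex CapitalRun = new Regex("[A-Z]([A-Z-]*[A-Z])?");

    // Prefix single capitals with "," and runs of two or more with ",,".
    public static string MarkCapitals(string input) =>
        CapitalRun.Replace(input, m => (m.Groups[1].Success ? ",," : ",") + m.Value);

    public static void Main()
    {
        Console.WriteLine(MarkCapitals("McGRAW-HILL"));       // ,Mc,,GRAW-HILL
        Console.WriteLine(MarkCapitals("a well-known TEST")); // a well-known ,,TEST
    }
}
```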
You can find them with a simple regex:
using System.Text.RegularExpressions;
// ..snip..
Regex r = new Regex("[A-Z]"); // This will capture only upper case characters
Match m = r.Match(input, 0);
The variable m of type System.Text.RegularExpressions.Match will contain a collection of captures. If only the first match matters, you can check its Index property directly.
Now you can insert the characters you want in that position, using String.Insert:
input = input.Insert(m.Index, doubleCapitalSign);
This code can solve your problem:
string x = "abcdEFghijkl";
string capitalized = string.Empty;
for (int i = 0; i < x.Length; i++)
{
// char.IsUpper is false for digits, spaces and punctuation, unlike a comparison against ToUpper()
if (char.IsUpper(x[i]))
capitalized += x[i];
}
Have you tried the Char.IsUpper method?
http://msdn.microsoft.com/en-us/library/9s91f3by.aspx
Another question uses that method to solve a similar problem:
Get the Index of Upper Case letter from a String
If you just want to find the first index of an uppercase letter:
var firstUpperCharIndex = text // <-- a string
.Select((chr, index) => new { chr, index })
.FirstOrDefault(x => Char.IsUpper(x.chr));
if (firstUpperCharIndex != null)
{
text = text.Insert(firstUpperCharIndex.index, ",,");
}
Not sure if this is what you are going for?
var inputString = string.Empty; //Your input string here
var output = new StringBuilder();
foreach (var c in inputString.ToCharArray())
{
if (char.IsUpper(c))
{
output.AppendFormat("_{0}_", c);
}
else
{
output.Append(c);
}
}
This loops through each character in inputString. If the character is uppercase, it inserts a _ before and after it (replace that with your desired Braille characters); otherwise it appends the character to the output unchanged.

Regular expression for parsing ::number::sentence:: [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
What would a regex look like for validating values of the form ::number::sentence::, such as these?
::1::some text::
::2::some text's::
::234::some's text's::
You could use String.Split and avoid a regex completely if your string is as simple as this e.g.
var data = "::234::some's text's::".Split(new string[] { "::" }, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(data[0]); // 234
Console.WriteLine(data[1]); // some's text's
If you need to use it for validation you can still use the same logic as above e.g.
public bool Validate(string str)
{
var data = str.Split(new string[] { "::" }, StringSplitOptions.RemoveEmptyEntries);
double n;
return data.Length == 2 && Double.TryParse(data[0], out n) && !String.IsNullOrWhiteSpace(data[1]);
}
...
bool valid = Validate("::234::some's text's::");
Something like:
^::([0-9]+)::((?:(?!::).)*)::$
Example code:
Match match = Regex.Match("::1::some text::", @"^::([0-9]+)::((?:(?!::).)*)::$");
var groups = match.Groups;
string num = groups[1].Value;
string text = groups[2].Value;
explanation:
^ Begin of the string
:: 2x ':'
([0-9]+) Match group 1, the 0-9 digits, one or more
:: 2x ':'
((?:(?!::).)*) Match group 2, zero or more characters, none of which starts a :: sequence
:: 2x ':'
$ End of the string
The ((?:(?!::).)*) requires a little more explanation... Let's peel it...
( ... ) the first '(' and last ')', match group 2
So now we have:
(?:(?!::).)*
so
(?: ... )* group without name (non-capturing group) repeated 0 or more times. Its content ends up in match group 2 because it is defined inside match group 2
composed of:
(?!::).
where
. is any character
BUT before consuming that character, a check is made: (?!::) asserts that the character and the one after it are not :: (this is called a zero-width negative lookahead)
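Putting the anchored pattern to work for validation might look like the sketch below. NumberSentenceParser and its TryParse method are hypothetical names, and the example assumes the number should fit in an int, so very long digit runs are rejected.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumberSentenceParser
{
    private static readonly Regex Pattern = new Regex(@"^::([0-9]+)::((?:(?!::).)*)::$");

    // Returns true and fills the out parameters when the input has the ::number::sentence:: shape.
    public static bool TryParse(string input, out int number, out string sentence)
    {
        number = 0;
        sentence = null;
        Match match = Pattern.Match(input);
        if (!match.Success || !int.TryParse(match.Groups[1].Value, out number))
        {
            return false;
        }
        sentence = match.Groups[2].Value;
        return true;
    }

    public static void Main()
    {
        int number;
        string sentence;
        Console.WriteLine(TryParse("::234::some's text's::", out number, out sentence)); // True
        Console.WriteLine(number + " / " + sentence);                                    // 234 / some's text's
        Console.WriteLine(TryParse("::1::bad::text::", out number, out sentence));       // False
    }
}
```

The second input is rejected because the lookahead stops group 2 at the inner ::, after which the pattern expects the end of the string.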
