Finding the longest substring regex? - c#

Someone knows how to find the longest substring composed of letters using using MatchCollection.
public static Regex pattern2 = new Regex("[a-zA-Z]");
public static string zad3 = "ala123alama234ijeszczepsa";

You can loop over all matches and get the longest:
string max = "";
foreach (Match match in Regex.Matches(zad3, "[a-zA-Z]+"))
if (max.Length < match.Value.Length)
max = match.Value;

Try this:
MatchCollection matches = pattern2.Matches(txt);
List<string> strLst = new List<string>();
foreach (Match match in matches)
strLst.Add(match.Value);
var maxStr1 = strLst.OrderByDescending(s => s.Length).First();
or better way :
var maxStr2 = matches.Cast<Match>().Select(m => m.Value).ToArray().OrderByDescending(s => s.Length).First();

best solution for your task is:
string zad3 = "ala123alama234ijeszczepsa54dsfd";
string max = Regex.Split(zad3,#"\d+").Max(x => x);

You must change your Regex pattern to include the repetition operator + so that it matches more than once.
[a-zA-Z] should be [a-zA-Z]+
You can get the longest value using LINQ. Order by the match length descending and then take the first entry. If there are no matches the result is null.
string pattern2 = "[a-zA-Z]+";
string zad3 = "ala123alama234ijeszczepsa";
var matches = Regex.Matches(zad3, pattern2);
string result = matches
.Cast<Match>()
.OrderByDescending(x => x.Value.Length)
.FirstOrDefault()?
.Value;
The string named result in this example is:
ijeszczepsa

Using linq and the short one:
string longest= Regex.Matches(zad3, pattern2).Cast<Match>()
.OrderByDescending(x => x.Value.Length).FirstOrDefault()?.Value;

you can find it in O(n) like this (if you do not want to use regex):
string zad3 = "ala123alama234ijeszczepsa";
int max=0;
int count=0;
for (int i=0 ; i<zad3.Length ; i++)
{
if (zad3[i]>='0' && zad3[i]<='9')
{
if (count > max)
max=count;
count=0;
continue;
}
count++;
}
if (count > max)
max=count;
Console.WriteLine(max);

Related

Getting a numbers from a string with chars glued

I need to recover each number in a glued string
For example, from these strings:
string test = "number1+3"
string test1 = "number 1+4"
I want to recover (1 and 3) and (1 and 4)
How can I do this?
CODE
string test= "number1+3";
List<int> res;
string[] digits= Regex.Split(test, #"\D+");
foreach (string value in digits)
{
int number;
if (int.TryParse(value, out number))
{
res.Add(number)
}
}
This regex should work
string pattern = #"\d+";
string test = "number1+3";
foreach (Match match in Regex.Matches(test, pattern))
Console.WriteLine("Found '{0}' at position {1}",
match.Value, match.Index);
Note that if you intend to use it multiple times, it's better, for performance reasons, to create a Regex instance than using this static method.
var res = new List<int>();
var regex = new Regex(#"\d+");
void addMatches(string text) {
foreach (Match match in regex.Matches(text))
{
int number = int.Parse(match.Value);
res.Add(number);
}
}
string test = "number1+3";
addMatches(test);
string test1 = "number 1+4";
addMatches(test1);
MSDN link.
Fiddle 1
Fiddle 2
This calls for a regular expression:
(\d+)\+(\d+)
Test it
Match m = Regex.Match(input, #"(\d+)\+(\d+)");
string first = m.Groups[1].Captures[0].Value;
string second = m.Groups[2].Captures[0].Value;
An alternative to regular expressions:
string test = "number 1+4";
int[] numbers = test.Replace("number", string.Empty, StringComparison.InvariantCultureIgnoreCase)
.Trim()
.Split("+", StringSplitOptions.RemoveEmptyEntries)
.Select(x => Convert.ToInt32(x))
.ToArray();

Create Regex Pattern for String using C#

I have string pattern like this:
#c1 12,34,222x8. 45,989,100x10. 767x55. #c1
I want to change these patterns into this:
c1,12,8
c1,34,8
c1,222,8
c1,45,10
c1,989,10
c1,100,10
c1,767,55
My code in C#:
private void btnProses_Click(object sender, EventArgs e)
{
String ps = txtpesan.Text;
Regex rx = new Regex("((?:\d+,)*(?:\d+))x(\d+)");
Match mc = rx.Match(ps);
while (mc.Success)
{
txtpesan.Text = rx.ToString();
}
}
I've been using split and replace but to no avail. After I tried to solve this problem, I see many people using regex, I tried to use regex but I do not get the logic of making a pattern regex.
What should I use to solve this problem?
sometimes regex is not good approach - old school way wins. Assuming valid input:
var tokens = txtpesan.Text.Split(' '); //or use split by regex's whitechar
var prefix = tokens[0].Trim('#');
var result = new StringBuilder();
//skip first and last token
foreach (var token in tokens.Skip(1).Reverse().Skip(1).Reverse())
{
var xIndex = token.IndexOf("x");
var numbers = token.Substring(0, xIndex).Split(',');
var lastNumber = token.Substring(xIndex + 1).Trim('.');
foreach (var num in numbers)
{
result.AppendLine(string.Format("{0},{1},{2}", prefix, num, lastNumber));
}
}
var viola = result.ToString();
Console.WriteLine(viola);
And here comes a somewhat ugly regex based solution:
var q = "#c1 12,34,222x8. 45,989,100x10. 767x55. #c1";
var results = Regex.Matches(q, #"(?:(?:,?\b(\d+))(?:x(\d+))?)+");
var caps = results.Cast<Match>()
.Select(m => m.Groups[1].Captures.Cast<Capture>().Select(cap => cap.Value));
var trailings = results.Cast<Match>().Select(m => m.Groups[2].Value).ToList();
var c1 = q.Split(' ')[0].Substring(1);
var cnt = 0;
foreach (var grp in caps)
{
foreach (var item in grp)
{
Console.WriteLine("{0},{1},{2}", c1, item, trailings[cnt]);
}
cnt++;
}
The regex demo can be seen here. The pattern matches blocks of comma-separated digits while capturing the digits into Group 1, and captures the digits after x into Group 2. Could not get rid of the cnt counter, sorry.

Regex to split by a Targeted String up to a certain character

I have an LDAP Query I need to build the domain.
So, split by "DC=" up to a "comma"
INPUT:
LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account
RESULT:
SOMETHING.ELSE.NET
You can do it pretty simple using DC=(\w*) regex pattern.
var str = #"LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account";
var result = String.Join(".", Regex.Matches(str, #"DC=(\w*)")
.Cast<Match>()
.Select(m => m.Groups[1].Value));
Without Regex you can do:
string ldapStr = #"LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account";
int startIndex = ldapStr.IndexOf("DC=");
int length = ldapStr.LastIndexOf("DC=") - startIndex;
string output = null;
if (startIndex >= 0 && length <= ldapStr.Length)
{
string domainComponentStr = ldapStr.Substring(startIndex, length);
output = String.Join(".",domainComponentStr.Split(new[] {"DC=", ","}, StringSplitOptions.RemoveEmptyEntries));
}
If you are always going to get the string in similar format than you can also do:
string ldapStr = #"LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account";
var outputStr = String.Join(".", ldapStr.Split(new[] {"DC=", ",","\\"}, StringSplitOptions.RemoveEmptyEntries)
.Skip(1)
.Take(3));
And you will get:
outputStr = "SOMETHINGS.ELSE.NET"

Extracting ONLY number values between specific characters while ignoring string-number combination

I have for example this string,
|SomeText1|123|0#$0#62|SomeText2|456|6#83|SomeText3#61|SomeText1#41|SomeText5#62|SomeText3#82|SomeText9#40|SomeText2#$1#166|SomeText2|999|7#146|SomeText2#167|SomeText2#166|
I want to extract only number values and add them to list and later sum them. That means values,
|123|,|456|, |999|.
All other values like,
|SomeText1|,|SomeText2|,|SomeText2#$1#166|
shouldn't be in list.
I'm working with C#. I tried something like,
int sum = 0;
List<int> results = new List<int>();
Regex regexObj = new Regex(#"\|(.*?)\|");
Match matchResults = regexObj.Match(s);
while (matchResults.Success)
{
results.Add(Convert.ToInt32(matchResults));
matchResults = matchResults.NextMatch();
}
for (int i = 0; i < results.Count; i++)
{
int bonusValues = results[i];
sum = sum + bonusValues;
}
So basic idea is to extract values between | | characters and ignore one that are not pure digits like
|#16543TextBL#aBLa564B|
string input = #"|SomeText1|123|0#$0#62|SomeText2|456|6#83|SomeText3#61|SomeText1#41|SomeText5#62|SomeText3#82|SomeText9#40|SomeText2#$1#166|SomeText2|999|7#146|SomeText2#167|SomeText2#166|";
var numbers = Regex.Matches(input, #"\|(\d+)\|")
.Cast<Match>()
.Select(m => m.Groups[1].Value).ToList();
var sum = numbers.Sum(n => int.Parse(n));
If regex isn't a definite requirement, you could use linq
stringName.Split("|".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
.Where(x => x.All(char.IsNumber)).ToList();
With sum
stringName.Split("|".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
.Where(x => x.All(char.IsNumber)).Sum(x => int.Parse(x));
you have to tried below menioned code
Regex regexObj = new Regex(#"\d+");
\d represents any digit, + for one or more.
If you want to catch negative numbers as well you can use -?\d+.
Regex regexObj = new Regex(#"-?\d+");
Note that as a string, it should be represented in C# as "\d+", or #"\d+"
if you don't need specific the regex stuff you could easily split the string with:
string[] splits = s.split('|');
then you could loop through this set of strings and try to parse it to an integer. I
for(int i=0; i<splits.size(); i++){
int num;
bool isNum = int.TryParse(splits[i], out num);
if(isNum){
list.add(num);
}
}

Extracting strings in .NET

I have a string that looks like this:
var expression = #"Args("token1") + Args("token2")";
I want to retrieve a collection of strings that are enclosed in Args("") in the expression.
How would I do this in C# or VB.NET?
Regex:
string expression = "Args(\"token1\") + Args(\"token2\")";
Regex r = new Regex("Args\\(\"([^\"]+)\"\\)");
List<string> tokens = new List<string>();
foreach (var match in r.Matches(expression)) {
string s = match.ToString();
int start = s.IndexOf('\"');
int end = s.LastIndexOf('\"');
tokens.add(s.Substring(start + 1, end - start - 1));
}
Non-regex (this assumes that the string in the correct format!):
string expression = "Args(\"token1\") + Args(\"token2\")";
List<string> tokens = new List<string>();
int index;
while (!String.IsNullOrEmpty(expression) && (index = expression.IndexOf("Args(\"")) >= 0) {
int start = expression.IndexOf('\"', index);
string s = expression.Substring(start + 1);
int end = s.IndexOf("\")");
tokens.Add(s.Substring(0, end));
expression = s.Substring(end + 2);
}
There is another regular expression method for accomplishing this, using lookahead and lookbehind assertions:
Regex regex = new Regex("(?<=Args\\(\").*?(?=\"\\))");
string input = "Args(\"token1\") + Args(\"token2\")";
MatchCollection matches = regex.Matches(input);
foreach (var match in matches)
{
Console.WriteLine(match.ToString());
}
This strips away the Args sections of the string, giving just the tokens.
If you want token1 and token2, you can use following regex
input=#"Args(""token1"") + Args(""token2"")"
MatchCollection matches = Regex.Matches(input,#"Args\(""([^""]+)""\)");
Sorry, If this is not what you are looking for.
if your collection looks like this:
IList<String> expression = new List<String> { "token1", "token2" };
var collection = expression.Select(s => Args(s));
As long as Args returns the same type as the queried collection type this should work okay
you can then iterate over the collection like so
foreach (var s in collection)
{
Console.WriteLine(s);
}

Categories