Create Regex Pattern for String using C#

I have a string pattern like this:
#c1 12,34,222x8. 45,989,100x10. 767x55. #c1
I want to change these patterns into this:
c1,12,8
c1,34,8
c1,222,8
c1,45,10
c1,989,10
c1,100,10
c1,767,55
My code in C#:
private void btnProses_Click(object sender, EventArgs e)
{
String ps = txtpesan.Text;
Regex rx = new Regex("((?:\d+,)*(?:\d+))x(\d+)");
Match mc = rx.Match(ps);
while (mc.Success)
{
txtpesan.Text = rx.ToString();
}
}
I tried Split and Replace, but to no avail. I then saw that many people use regex for this kind of problem, so I tried it too, but I do not understand the logic of building a regex pattern.
What should I use to solve this problem?

Sometimes regex is not the best approach, and the old-school way wins. Assuming valid input:
var tokens = txtpesan.Text.Split(' '); // or split on whitespace with Regex.Split
var prefix = tokens[0].Trim('#');
var result = new StringBuilder();
//skip first and last token
foreach (var token in tokens.Skip(1).Reverse().Skip(1).Reverse())
{
var xIndex = token.IndexOf("x");
var numbers = token.Substring(0, xIndex).Split(',');
var lastNumber = token.Substring(xIndex + 1).Trim('.');
foreach (var num in numbers)
{
result.AppendLine(string.Format("{0},{1},{2}", prefix, num, lastNumber));
}
}
var viola = result.ToString();
Console.WriteLine(viola);

And here comes a somewhat ugly regex based solution:
var q = "#c1 12,34,222x8. 45,989,100x10. 767x55. #c1";
var results = Regex.Matches(q, @"(?:(?:,?\b(\d+))(?:x(\d+))?)+");
var caps = results.Cast<Match>()
.Select(m => m.Groups[1].Captures.Cast<Capture>().Select(cap => cap.Value));
var trailings = results.Cast<Match>().Select(m => m.Groups[2].Value).ToList();
var c1 = q.Split(' ')[0].Substring(1);
var cnt = 0;
foreach (var grp in caps)
{
foreach (var item in grp)
{
Console.WriteLine("{0},{1},{2}", c1, item, trailings[cnt]);
}
cnt++;
}
The pattern matches blocks of comma-separated numbers, capturing each number into Group 1 and the digits after x into Group 2. I could not get rid of the cnt counter, sorry.
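For comparison, here is a sketch that keeps the pattern from the question and avoids the counter: each match carries both the number list (Group 1) and the multiplier (Group 2), so the list can simply be split on commas. The class and method names are made up for illustration, and the code assumes the prefix is the first space-separated token with its leading # removed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PatternExpander
{
    // Expands "#c1 12,34,222x8. 767x55. #c1" into lines such as "c1,12,8".
    public static List<string> Expand(string input)
    {
        // Assumption: the prefix is the first space-separated token, minus '#'.
        string prefix = input.Split(' ')[0].TrimStart('#');

        var lines = new List<string>();
        // Group 1: the comma-separated numbers; Group 2: the digits after 'x'.
        foreach (Match m in Regex.Matches(input, @"((?:\d+,)*\d+)x(\d+)"))
        {
            foreach (string number in m.Groups[1].Value.Split(','))
            {
                lines.Add(prefix + "," + number + "," + m.Groups[2].Value);
            }
        }
        return lines;
    }
}
```

Calling PatternExpander.Expand(txtpesan.Text) and joining the result with Environment.NewLine would give the text to put back into the text box.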

Related

Finding the longest substring regex?

Does anyone know how to find the longest substring composed of letters using MatchCollection?
public static Regex pattern2 = new Regex("[a-zA-Z]");
public static string zad3 = "ala123alama234ijeszczepsa";
You can loop over all matches and get the longest:
string max = "";
foreach (Match match in Regex.Matches(zad3, "[a-zA-Z]+"))
if (max.Length < match.Value.Length)
max = match.Value;
Try this (assuming pattern2 has been changed to "[a-zA-Z]+"):
MatchCollection matches = pattern2.Matches(zad3);
List<string> strLst = new List<string>();
foreach (Match match in matches)
strLst.Add(match.Value);
var maxStr1 = strLst.OrderByDescending(s => s.Length).First();
Or, more compactly:
var maxStr2 = matches.Cast<Match>().Select(m => m.Value).ToArray().OrderByDescending(s => s.Length).First();
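Another option is to split on the digits and take the longest piece:
string zad3 = "ala123alama234ijeszczepsa54dsfd";
// Order by length; Max(x => x) would compare the strings alphabetically.
string max = Regex.Split(zad3, @"\d+").OrderByDescending(x => x.Length).First();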
You must change your Regex pattern to include the repetition operator + so that it matches more than once.
[a-zA-Z] should be [a-zA-Z]+
You can get the longest value using LINQ. Order by the match length descending and then take the first entry. If there are no matches the result is null.
string pattern2 = "[a-zA-Z]+";
string zad3 = "ala123alama234ijeszczepsa";
var matches = Regex.Matches(zad3, pattern2);
string result = matches
.Cast<Match>()
.OrderByDescending(x => x.Value.Length)
.FirstOrDefault()?
.Value;
The string named result in this example is:
ijeszczepsa
The same with LINQ, in a shorter form:
string longest= Regex.Matches(zad3, pattern2).Cast<Match>()
.OrderByDescending(x => x.Value.Length).FirstOrDefault()?.Value;
You can find the length of the longest run in O(n) like this, if you do not want to use regex (any non-digit character counts as part of a run):
string zad3 = "ala123alama234ijeszczepsa";
int max=0;
int count=0;
for (int i=0 ; i<zad3.Length ; i++)
{
if (zad3[i]>='0' && zad3[i]<='9')
{
if (count > max)
max=count;
count=0;
continue;
}
count++;
}
if (count > max)
max=count;
Console.WriteLine(max);
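If the substring itself is needed rather than its length, the same single pass can also track where the best run starts. A minimal sketch, assuming char.IsLetter is an acceptable definition of "letter" (it accepts any Unicode letter, not only a-z); the class and method names are made up for illustration:

```csharp
using System;

public static class LongestRun
{
    // Returns the longest run of consecutive letters, or "" if there is none.
    // When several runs share the maximum length, the first one is returned.
    public static string LongestLetterRun(string text)
    {
        int bestStart = 0, bestLength = 0;
        int runStart = 0;
        for (int i = 0; i <= text.Length; i++)
        {
            // The end of the string closes the current run, as any non-letter does.
            bool isLetter = i < text.Length && char.IsLetter(text[i]);
            if (isLetter) continue;

            int runLength = i - runStart;
            if (runLength > bestLength)
            {
                bestStart = runStart;
                bestLength = runLength;
            }
            runStart = i + 1; // the next run can only start after this character
        }
        return text.Substring(bestStart, bestLength);
    }
}
```

This stays O(n) and allocates only the final result string.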

C# Split string into array based on prior character

I need to take a string and split it into an array wherever the type of a character does not match the type of the character preceding it.
So if you have "asd fds 1.4#3" this would split into array as follows
stringArray[0] = "asd";
stringArray[1] = " ";
stringArray[2] = "fds";
stringArray[3] = " ";
stringArray[4] = "1";
stringArray[5] = ".";
stringArray[6] = "4";
stringArray[7] = "#";
stringArray[8] = "3";
Any recommendations on the best way to achieve this? Of course I could write a loop over .ToCharArray(), but I was looking for a better way.
Thank you
Using a combination of regular expressions and LINQ you can do the following.
using System.Text.RegularExpressions;
using System.Linq;
var str="asd fds 1.4#3";
var regex=new Regex("([A-Za-z]+)|([0-9]+)|([.#]+)|(.+?)");
var result=regex.Matches(str).OfType<Match>().Select(x=>x.Value).ToArray();
Add additional capture groups to capture other differences. The last capture (.+?) is a non-greedy match for everything else, so every character that falls into this capture is treated as a separate item (even when the same character appears twice in a row).
Update - new revision of regex
var regex=new Regex(@"(?:[A-Za-z]+)|(?:[0-9]+)|(?:[#.]+)|(?:(?:(.)\1*)+?)");
This now uses non-capturing groups so that \1 can be used in the final capture. This means that repeats of the same character are grouped together when they fall into the catch-all group.
For example, before this change the string "asd  fsd" (with two spaces) would produce 4 strings, because each space was considered different; now the result is 3 strings, because the 2 adjacent spaces are combined.
Use regex:
var mc = Regex.Matches("asd fds 1.4#3", @"([a-zA-Z]+)|.");
var res = new string[mc.Count];
for (var i = 0; i < mc.Count; i++)
{
res[i] = mc[i].Value;
}
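This program produces the output you want for the sample, but I am not sure whether it is generic enough for your goal: every non-letter character becomes its own element.
using System.Collections.Generic;
using System.Linq;
using System.Text;

class Program
{
private static void Main(string[] args)
{
var splited = Split("asd fds 1.4#3").ToArray();
}
public static IEnumerable<string> Split(string text)
{
StringBuilder result = new StringBuilder();
foreach (var ch in text)
{
if (char.IsLetter(ch))
{
result.Append(ch);
}
else
{
// Emit the pending run of letters, if any, before the separator.
if (result.Length > 0)
{
yield return result.ToString();
result.Clear();
}
yield return ch.ToString();
}
}
// Emit any letters left over at the end of the input.
if (result.Length > 0)
{
yield return result.ToString();
}
}
}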
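A loop that compares each character's type with the type of the one before it, which is what the question describes, could look roughly like this. This is only a sketch under assumptions the question does not spell out: there are three classes (letters, digits, everything else), runs of letters or digits stay together, and characters in the "everything else" class are grouped only with identical neighbours. The names CharClassSplitter, Kind and Classify are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class CharClassSplitter
{
    private enum Kind { Letter, Digit, Other }

    private static Kind Classify(char c)
    {
        if (char.IsLetter(c)) return Kind.Letter;
        if (char.IsDigit(c)) return Kind.Digit;
        return Kind.Other;
    }

    // True when a new element should start at 'current'.
    private static bool StartsNewPart(char previous, char current)
    {
        Kind kind = Classify(current);
        if (kind != Classify(previous)) return true;
        // Assumption: symbols and whitespace only group with an identical character.
        return kind == Kind.Other && current != previous;
    }

    public static List<string> Split(string text)
    {
        var parts = new List<string>();
        int start = 0;
        for (int i = 1; i <= text.Length; i++)
        {
            // The end of the string always closes the last element.
            if (i == text.Length || StartsNewPart(text[i - 1], text[i]))
            {
                parts.Add(text.Substring(start, i - start));
                start = i;
            }
        }
        return parts;
    }
}
```

CharClassSplitter.Split("asd fds 1.4#3").ToArray() gives the nine elements listed in the question. Changing what counts as the same type only requires touching Classify and StartsNewPart.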

how to split the line in the text file

Text File:
$3.00,0.00,0.00,1.00,L55894M8,$3.00,0.00,0.00,2.00,L55894M9
How do I split the line and get the serial number like L55894M8 and L55894M9?
To get the data that appears after the 4th comma and 9th comma, you would want to do:
var pieces = line.Split(',');
var serial1 = pieces[4]; // the field after the 4th comma
var serial2 = pieces[9]; // the field after the 9th comma
Edit: Upon further reflection, it appears your file has records that begin with $ and end with the next record. If you want these records, along with the serial number (which appears to be the last field) you can do:
var records = line.TrimStart('$').Split('$');
var recordObjects = records.Select(r => new { Line = r, Serial = r.TrimEnd(',').Split(',').Last() });
In your sample, it looks like you want the fields at indexes 4, 9, 14, and so on, with every five fields forming one group.
You can try it this way:
static void Main(string[] args)
{
string strSample = "$3.00,0.00,0.00,1.00,L55894M8,$3.00,0.00,0.00,2.00,L55894M9";
var result = from p in strSample.Split(',').Select((v, i) => new { Index = i, Value = v })
where p.Index % 5 == 4
select p.Value;
foreach (var r in result)
{
Console.WriteLine(r);
}
Console.ReadKey();
}
If the file is in a string, you can use the string's Split(',') method and then check each element of the resulting array, or grab every 5th element if that pattern holds throughout the data.
var str = "$3.00,0.00,0.00,1.00,L55894M8,$3.00,0.00,0.00,2.00,L55894M9";
var fieldArray = str.Split(new[] { ',' });
var serial1 = fieldArray[4]; // "L55894M8"
var serial2 = fieldArray[9]; // "L55894M9"
Try a regular expression.
string str = "$3.00,0.00,0.00,1.00,L55894M8,$3.00,0.00,0.00,2.00,L55894M9";
string pat = @"L[\w]+";
MatchCollection ar= Regex.Matches(str, pat);
foreach (var t in ar)
Console.WriteLine(t);

Split string with ' in C#

How to split this string
1014,'0,1031,1032,1034,1035,1036',0,0,1,1,0,1,0,-1,1
and get this string array as result
1014
'0,1031,1032,1034,1035,1036'
0
0
1
1
0
1
0
-1
1
in C#
I believe that this regex should give you what you are looking for:
('(?:[^']|'')*'|[^',\r\n]*)(,|\r\n?|\n)?
http://regexr.com?2vib4
EDIT:
Quick code snippet on how it might work:
var rx = new Regex("('(?:[^']|'')*'|[^',\r\n]*)(,|\r\n?|\n)?");
var text= "1014,'0,1031,1032,1034,1035,1036',0,0,1,1,0,1,0,-1,1";
var matches = rx.Matches(text);
foreach (Match match in matches)
{
System.Console.WriteLine(match.Groups[1].ToString());
}
Try this:
string line ="1014,'0,1031,1032,1034,1035,1036',0,0,1,1,0,1,0,-1,1" ;
var values = Regex.Matches(line, "(?:'(?<m>[^']*)')|(?<m>[^,]+)");
foreach (Match value in values) {
Console.WriteLine(value.Groups["m"].Value);
}
This code is not pretty at all, but it works. :) (Does not work with multiple "strings" within the string.)
void Main()
{
string stuff = "1014,'0,1031,1032,1034,1035,1036',0,0,1,1,0,1,0,-1,1";
List<string> newStuff = new List<string>();
var extract = stuff.Substring(stuff.IndexOf('\''), stuff.IndexOf('\'', stuff.IndexOf('\'') + 1) - stuff.IndexOf('\'') + 1);
var oldExtract = extract;
extract = extract.Replace(',',';');
stuff = stuff.Replace(oldExtract, extract);
newStuff.AddRange(stuff.Split(new[] {','}));
var newList = newStuff;
for(var i = 0; i < newList.Count; i++)
newList[i] = newList[i].Replace(';',',');
// And newList will be in the format you specified, but in a list..
}
First split the string on the single quote ('), and then split the resulting pieces on the comma (,).
You don't need a parser, you don't need Regex. Here's a pretty simple version that works perfectly:
var splits = input
.Split('\'')
.SelectMany(
(s,i) => (i%2==0)
? s.Split(new[]{','}, StringSplitOptions.RemoveEmptyEntries)
: new[]{ "'" + s + "'"}
);
This is exactly what @AVD + @Rawling described ... Split on ', and split only "even" results, then combine.
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO; //Microsoft.VisualBasic.dll
public class Sample {
static void Main(){
string data = "1014,'0,1031,1032,1034,1035,1036',0,0,1,1,0,1,0,-1,1";
string[] fields = null;
data = data.Replace('\'', '"');
using(var csvReader = new TextFieldParser(new StringReader(data))){
csvReader.SetDelimiters(new string[] {","});
csvReader.HasFieldsEnclosedInQuotes = true;
fields = csvReader.ReadFields();
}
foreach(var item in fields){
Console.WriteLine("{0}",item);
}
}
}
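If neither regex nor the Microsoft.VisualBasic reference is wanted, a single pass with a flag that remembers whether the scan is currently inside quotes is another option. A minimal sketch, assuming quotes are always balanced and never escaped inside a field; the class and method names are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuotedSplitter
{
    // Splits on commas, except for commas between a pair of single quotes.
    public static List<string> Split(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in line)
        {
            if (c == '\'')
            {
                inQuotes = !inQuotes; // every quote opens or closes a quoted field
                current.Append(c);    // keep the quote, as in the expected output
            }
            else if (c == ',' && !inQuotes)
            {
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // the last field has no trailing comma
        return fields;
    }
}
```

Unlike the Split('\'') version, this keeps empty fields (for example between two adjacent commas) and it handles any number of quoted fields in the line.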

Extracting strings in .NET

I have a string that looks like this:
var expression = @"Args(""token1"") + Args(""token2"")";
I want to retrieve a collection of strings that are enclosed in Args("") in the expression.
How would I do this in C# or VB.NET?
Regex:
string expression = "Args(\"token1\") + Args(\"token2\")";
Regex r = new Regex("Args\\(\"([^\"]+)\"\\)");
List<string> tokens = new List<string>();
foreach (Match match in r.Matches(expression)) {
// Group 1 is the text captured between the quotes.
tokens.Add(match.Groups[1].Value);
}
Non-regex (this assumes that the string is in the correct format!):
string expression = "Args(\"token1\") + Args(\"token2\")";
List<string> tokens = new List<string>();
int index;
while (!String.IsNullOrEmpty(expression) && (index = expression.IndexOf("Args(\"")) >= 0) {
int start = expression.IndexOf('\"', index);
string s = expression.Substring(start + 1);
int end = s.IndexOf("\")");
tokens.Add(s.Substring(0, end));
expression = s.Substring(end + 2);
}
There is another regular expression method for accomplishing this, using lookahead and lookbehind assertions:
Regex regex = new Regex("(?<=Args\\(\").*?(?=\"\\))");
string input = "Args(\"token1\") + Args(\"token2\")";
MatchCollection matches = regex.Matches(input);
foreach (var match in matches)
{
Console.WriteLine(match.ToString());
}
This strips away the Args sections of the string, giving just the tokens.
If you want token1 and token2, you can use the following regex:
string input = @"Args(""token1"") + Args(""token2"")";
MatchCollection matches = Regex.Matches(input, @"Args\(""([^""]+)""\)");
// Each token is then available as match.Groups[1].Value.
Sorry if this is not what you are looking for.
if your collection looks like this:
IList<String> expression = new List<String> { "token1", "token2" };
var collection = expression.Select(s => Args(s));
As long as Args returns the same type as the queried collection type this should work okay
you can then iterate over the collection like so
foreach (var s in collection)
{
Console.WriteLine(s);
}
