Get particular field or characters from string line in C# - c#

I have a file that I read line by line, and I need to extract particular fields from each line.
For example, a line can be in one of two formats:
VA001748714600006640126132202STRONG 4P 4X44G000099
VA 00174 871460000664 012 6132202 STRONG 4P 4X44G 000099
Now I need to extract the values and store them in my table. Both of the lines above should produce the fields below (desired result):
Code Location SerialNo Quantity ItemNo Description Price
VA 00174 871460000664 12 6132202 STRONG 4P 4X44G0 000099
What I have tried: I created a method that returns an object[] extracted from the line.
public static object[] ProcessLine(string line)
{
var obj = new object[7];
var str = line.Replace("\0", "").Replace(" ", "");
string code = str.Substring(0, 2)?.Trim();
string location = str.Substring(2, 5)?.Trim();
string serialNo = str.Substring(7, 12)?.Trim();
string quantity = str.Substring(19, 3)?.Trim();
int qty = 0;
if (!string.IsNullOrEmpty(quantity))
{
qty = Convert.ToInt32(quantity);
}
string itemNo = str.Substring(22, 7)?.Trim();
Regex MyRegex = new Regex("[^a-z ]", RegexOptions.IgnoreCase);
string description = MyRegex.Replace(line.Substring(2), @"")?.Trim();
string price = str.Substring(str.Length - 6)?.Trim();
obj.SetValue(code, 0);
obj.SetValue(location, 1);
obj.SetValue(serialNo, 2);
obj.SetValue(qty, 3);
obj.SetValue(itemNo, 4);
obj.SetValue(description, 5);
obj.SetValue(price, 6);
return obj;
}
I extract each substring and store it in the object array, but I can't extract Description because that field does not have a fixed length.
(Code, Location, SerialNo, Quantity, ItemNo and Price) have a fixed number of characters, while (Description) can contain any characters and varies in length.
How can I extract these field values, including the description, using a regex? My attempt extracts the description without its digits.

If you really want to use a regex, see Wiktor's answer.
However, you don't need a regex for this problem.
Since all fields except description have known lengths, you can calculate the length of the description field. From your specs the description starts at position 29, and is followed by 6 positions for the price field. Therefore, this should give you the description:
string description = str.Substring(29, str.Length-29-6);
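A minimal sketch of that calculation, assuming the same normalisation as the question's ProcessLine (null characters and spaces removed, so the description loses its internal spaces). The class and method names are hypothetical; the widths come from the question's Substring calls.

```csharp
using System;

public static class FixedWidthSketch
{
    // Widths from the question: 2 + 5 + 12 + 3 + 7 = 29 leading characters.
    private const int PrefixLength = 29;
    // The price field occupies the last 6 characters.
    private const int PriceLength = 6;

    public static string ExtractDescription(string line)
    {
        // Same clean-up as the question's code: drop null chars and spaces.
        string str = line.Replace("\0", "").Replace(" ", "");

        if (str.Length < PrefixLength + PriceLength)
            throw new ArgumentException("Line is shorter than the fixed-width fields.", nameof(line));

        // Everything between the fixed prefix and the trailing price is the description.
        return str.Substring(PrefixLength, str.Length - PrefixLength - PriceLength);
    }
}
```

Both sample lines from the question yield STRONG4P4X44G here. If the spaces inside the description matter, the spaces would have to be kept, and the offsets then differ between the two input formats.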

You may declare a regex like
private static readonly Regex rx = new Regex(@"^(\w{2})\s*(\w{5})\s*(\w{12})\s*(\d{3})\s*(\d{7})\s*(.*?)\s*(\d{6})$", RegexOptions.Compiled);
See the regex demo.
The point is to use a regex that matches the whole string: ^ matches the start of the string and $ matches the end. \w matches any letter, digit or _ char and \d matches any digit char; the {m} quantifier matches exactly m of the chars matched with \w or \d. The Description field is matched with .*?, a lazy dot pattern that matches any 0+ chars other than newline, as few as possible, and \s* allows any 0+ whitespace chars in between fields.
Then, you may use it
public static object[] ProcessLine(string line)
{
object[] obj = null;
var m = rx.Match(line);
if (m.Success)
{
obj = new object[] {
m.Groups[1].Value,
m.Groups[2].Value,
m.Groups[3].Value,
int.Parse(m.Groups[4].Value).ToString(), // remove leading zeros
m.Groups[5].Value,
m.Groups[6].Value,
m.Groups[7].Value
};
}
return obj;
}
See the C# demo, demo output for both the strings in OP:
VA, 00174, 871460000664, 12, 6132202, KING PEPERM E STRONG 4P 4X44G, 000099
VA, 00174, 871460000664, 12, 6132202, KING PEPERM E STRONG 4P 4X44G, 000099

Related

Regex to find all placeholder occurrences in text

I'm struggling to create a Regex that finds all placeholder occurrences in a given text. Placeholders will have the following format:
[{PRE.Word1.Word2}]
Rules:
Delimited by "[{PRE." and "}]" ("PRE" upper case)
2 words (at least 1 char long each) separated by a dot. All chars valid on each word apart from newline.
word1: min 1 char, max 15 chars
word2: min 1 char, max 64 chars
word1 cannot have dots, if there are more than 2 dots inside placeholder extra ones will be part of word2. If less than 2 dots, placeholder is invalid.
Looking to get all valid placeholders regardless of what the 2 words are.
I'm not being lazy; I just spent a horrible amount of time building the rule on regexr.com, but was unable to satisfy all these rules.
Looking forward to checking your suggestions.
The closest I've got to was the below, and any attempt to expand on that breaks all valid matches.
\[\{OEP\.*\.*\}\]
Much appreciated!
Sample text where Regex should find matches:
Random text here
[{Test}] -- NO MATCH
[{PRE.TestTest3}] --NO MATCH
[{PRE.TooLong.12345678901234567890}] --NO MATCH
[{PRE.Address.Country}] --MATCH
[{PRE.Version.1.0}] --MATCH
Random text here
You can use
\[{PRE\.([^][{}.]{1,15})\.(.{1,64}?)}]
See the regex demo
Details
\[{ - a [{ string
PRE\. - PRE. text
([^][{}.]{1,15}) - Group 1: any one to fifteen chars other than [, ], {, } and .
\. - a dot
(.{1,64}?) - any one to 64 chars other than line break chars as few as possible
}] - a }] text.
If you need to get all matches in C#, you can use
var pattern = @"\[{PRE\.([^][{}.]{1,15})\.(.{1,64}?)}]";
var matches = Regex.Matches(text, pattern);
See this C# demo:
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;
public class Test
{
public static void Main()
{
var text = "[{PRE.Word1.Word2}] and [{PRE.Word 3.Word..... 2 %%%}]";
var pattern = @"\[{PRE\.([^][{}.]{1,15})\.(.{1,64}?)}]";
var matches = Regex.Matches(text, pattern);
var props = new List<Property>();
foreach (Match m in matches)
props.Add(new Property(m.Groups[1].Value,m.Groups[2].Value));
foreach (var item in props)
Console.WriteLine("Word1 = " + item.Word1 + ", Word2 = " + item.Word2);
}
public class Property
{
public string Word1 { get; set; }
public string Word2 { get; set; }
public Property()
{}
public Property(string w1, string w2)
{
this.Word1 = w1;
this.Word2 = w2;
}
}
}
Output:
Word1 = Word1, Word2 = Word2
Word1 = Word 3, Word2 = Word..... 2 %%%
string input = "[{PRE.Word1.Word2}]";
// language=regex
string pattern = @"\[{ PRE \. (?'group1' .{1,15}? ) \. (?'group2' .{1,64}? ) }]";
var match = Regex.Match(input, pattern, RegexOptions.IgnorePatternWhitespace);
Console.WriteLine(match.Groups["group1"].Value);
Console.WriteLine(match.Groups["group2"].Value);

How do I Split the string into two separate variables in C#

I have a string that I want to store in two different variables in C#.
s= "Name=team1; ObjectGUID=d8fd5125-b065-48cb-b5f3-c20f509b7476"
I want Var1 = team1 & Var2 = d8fd5125-b065-48cb-b5f3-c20f509b7476
Here's what I am trying to do:
var1 = s.Replace("Name=","").Replace("; ObjectGUID=", "");
But I am not able to figure out how to bifurcate the Name value to var1 and eliminate the rest. And it is possible that the value of 'Name' could vary so I can't fix the length to chop off.
You could use a regex where the value of Name could be captured in group 1 matching not a ; using a negated character class.
The value of ObjectGUID could be captured in group 2 using a repeated pattern matching 1+ times a digit 0-9 or characters a-f. Then repeat that pattern 1+ times preceded with a -
Name=([^;]+); ObjectGUID=([a-f0-9]+(?:-[a-f0-9]+)+)
.NET regex demo | C# demo
For example:
string pattern = @"Name=([^;]+); ObjectGUID=([a-f0-9]+(?:-[a-f0-9]+)+)";
string s= "Name=team1; ObjectGUID=d8fd5125-b065-48cb-b5f3-c20f509b7476";
Match m = Regex.Match(s, pattern);
string var1 = m.Groups[1].Value;
string var2 = m.Groups[2].Value;
Console.WriteLine(var1);
Console.WriteLine(var2);
Result
team1
d8fd5125-b065-48cb-b5f3-c20f509b7476
Split by ';' then split by '='. Also works for any key/value pairs such as the ones in connection strings.
// requires: using System.Linq;
var values = s.Split(';').Select(kv => kv.Split('=')[1]).ToArray();
var var1 = values[0];
var var2 = values[1];
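For the general key/value case mentioned above, one possible sketch collects the pairs into a dictionary so that values can be looked up by name rather than by position. The class and method names are hypothetical, and it assumes keys are unique and that only the first '=' in each pair separates key from value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyValueSketch
{
    public static Dictionary<string, string> Parse(string s)
    {
        return s.Split(';')
                // Split each pair on the first '=' only, so values may contain '='.
                .Select(part => part.Split(new[] { '=' }, 2))
                // Skip fragments that have no '=' at all (e.g. a trailing ';').
                .Where(kv => kv.Length == 2)
                // Trim to drop the space that follows each ';'. Throws on duplicate keys.
                .ToDictionary(kv => kv[0].Trim(), kv => kv[1].Trim());
    }
}
```

With the string from the question, Parse(s)["Name"] gives team1 and Parse(s)["ObjectGUID"] gives the GUID.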
You can use IndexOf to take point at "=" and Substring to take the next value.
using System;
public class SubStringTest {
public static void Main() {
string [] info = { "Name: Felica Walker", "Title: Mz.",
"Age: 47", "Location: Paris", "Gender: F"};
int found = 0;
Console.WriteLine("The initial values in the array are:");
foreach (string s in info)
Console.WriteLine(s);
Console.WriteLine("\nWe want to retrieve only the key information. That is:");
foreach (string s in info) {
found = s.IndexOf(": ");
Console.WriteLine(" {0}", s.Substring(found + 2));
}
}
}
The example displays the following output:
The initial values in the array are:
Name: Felica Walker
Title: Mz.
Age: 47
Location: Paris
Gender: F
We want to retrieve only the key information. That is:
Felica Walker
Mz.
47
Paris
F
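Applied to the string from the question, the same IndexOf/Substring idea might look like the sketch below. The helper name ValueAfter is hypothetical, and it assumes values end at the next ';' or at the end of the string. It looks for the first occurrence of key= anywhere in the string, so a key that is a suffix of another key (for example Name inside FullName) would need extra care.

```csharp
using System;

public static class IndexOfSketch
{
    // Returns the text between "key=" and the next ';' (or end of string), or null if the key is absent.
    public static string ValueAfter(string s, string key)
    {
        int start = s.IndexOf(key + "=", StringComparison.Ordinal);
        if (start < 0)
            return null;

        // Move past the key and the '=' sign.
        start += key.Length + 1;

        int end = s.IndexOf(';', start);
        return end < 0 ? s.Substring(start) : s.Substring(start, end - start);
    }
}
```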

Get particular parts from a string

I'm trying to get particular parts from a string. I have to get the part which starts after '#' and contains only letters from the Latin alphabet.
I suppose that I have to create a regex pattern, but I don't know how.
string test = "PQ#Alderaa1:30000!A!->20000";
var planet = "Alderaa"; //what I want to get
string test2 = "#Cantonica:3000!D!->4000NM";
var planet2 = "Cantonica";
There are some other parts which I have to get, but I will try to get them myself. (starts after ':' and is an Integer; may be "A" (attack) or "D" (destruction) and must be surrounded by "!" (exclamation mark); starts after "->" and should be an Integer)
You could get the separate parts using capturing groups:
#([a-zA-Z]+)[^:]*:(\d+)!([AD])!->(\d+)
That will match:
#([a-zA-Z]+) Match # and capture in group 1 1+ times a-zA-Z
[^:]*: Match 0+ times not a : using a negated character class, then match a : (If what follows could be only optional digits, you might also match 0+ times a digit [0-9]*)
(\d+) Capture in group 2 1+ digits
!([AD])! Match !, capture in group 3 and A or D, then match !
->(\d+) Match -> and capture in group 4 1+ digits
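As a rough sketch of how the four groups described above could be read in C# (the class and method names are hypothetical, and returning null for a non-matching line is just one possible choice):

```csharp
using System;
using System.Text.RegularExpressions;

public static class PlanetSketch
{
    // Same pattern as above, written as a verbatim string so the backslashes need no doubling.
    private static readonly Regex Rx = new Regex(@"#([a-zA-Z]+)[^:]*:(\d+)!([AD])!->(\d+)");

    // Returns { name, first number, A or D, second number }, or null when the input does not match.
    public static string[] Parse(string input)
    {
        Match m = Rx.Match(input);
        if (!m.Success)
            return null;

        return new[]
        {
            m.Groups[1].Value,
            m.Groups[2].Value,
            m.Groups[3].Value,
            m.Groups[4].Value
        };
    }
}
```

For the two sample strings in the question, the first element is Alderaa and Cantonica respectively; the trailing NM in the second string is simply not consumed by the pattern.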
Demo | C# Demo
You can use this regex, which uses a positive look behind to ensure the matched text is preceded by # and one or more alphabets get captured using [a-zA-Z]+ and uses a positive look ahead to ensure it is followed by some optional text, a colon, then one or more digits followed by ! then either A or D then again a !
(?<=#)[a-zA-Z]+(?=[^:]*:\d+![AD]!)
Demo
C# code demo
string test = "PQ#Alderaa1:30000!A!->20000";
Match m1 = Regex.Match(test, @"(?<=#)[a-zA-Z]+(?=[^:]*:\d+![AD]!)");
Console.WriteLine(m1.Groups[0].Value);
test = "#Cantonica:3000!D!";
m1 = Regex.Match(test, @"(?<=#)[a-zA-Z]+(?=[^:]*:\d+![AD]!)");
Console.WriteLine(m1.Groups[0].Value);
Prints,
Alderaa
Cantonica
You already have good answers, but I would like to add one more to show named capturing groups.
You can create a class for your planets, like
class Planet
{
public string Name;
public int Value1; // name is not clear from context
public string Category; // as above: rename it
public string Value2; // same problem
}
Now you can use regex with named groups
#(?<name>[a-z]+)[^:]*:(?<value1>\d+)!(?<category>[^!]+)!->(?<value2>[\da-z]+)
Demo
Usage:
var input = new[]
{
"PQ#Alderaa1:30000!A!->20000",
"#Cantonica:3000!D!->4000NM",
};
var regex = new Regex("#(?<name>[a-z]+)[^:]*:(?<value1>\\d+)!(?<category>[^!]+)!->(?<value2>[\\da-z]+)",
RegexOptions.IgnoreCase | RegexOptions.Compiled);
var planets = input
.Select(p => regex.Match(p))
.Select(m => new Planet
{
Name = m.Groups["name"].Value, // here and further we can access to part of input string by name
Value1 = int.Parse(m.Groups["value1"].Value),
Category = m.Groups["category"].Value,
Value2 = m.Groups["value2"].Value
})
.ToList();

How to match a specific sentence with Regex

I'm new to Regex and I couldn't cope with matching this sort of sentence: Band Name #Venue 30 450, where the digits at the end represent price and quantity.
string input = "Band Name #City 25 3500";
Match m = Regex.Match(input, @"^[A-Za-z]+\s+[A-Za-z]+\s+[\d+]+\s+[\d+]$");
if (m.Success)
{
Console.WriteLine("Success!");
}
You can use Regex with named groups. This makes it easier to extract the data later if you need it. For example:
string pattern = @"(Band) (?<Band>[A-Za-z ]+) (?<City>#[A-Za-z ]+) (?<Price>\d+) (?<Quantity>\d+)";
string input = "Band Name #City 25 3500";
Match match = Regex.Match(input, pattern);
Console.WriteLine(match.Groups["Band"].Value);
Console.WriteLine(match.Groups["City"].Value.TrimStart('#'));
Console.WriteLine(match.Groups["Price"].Value);
Console.WriteLine(match.Groups["Quantity"].Value);
If you look at the pattern, there are a few named regex groups, written as ?<GroupName>. It is just a basic example that can be tweaked to fit your actual needs.
This one should work:
[A-Za-z ]+ [A-Za-z ]+ #[A-Za-z ]+ \d+ \d+
Can test it here.
With your code it'd be:
string input = "Band Name #City 25 3500";
Match m = Regex.Match(input, @"[A-Za-z ]+ [A-Za-z ]+ #[A-Za-z ]+ \d+ \d+"); // verbatim string, otherwise \d is an invalid escape
if (m.Success)
{
Console.WriteLine("Success!");
}
Here is a very old and elaborate way (1st way):
string re1=".*?"; // Here the part before #
string re2="(#)"; // Any Single Character 1
string re3="((?:[a-z][a-z]+))"; // Word 1, here city
string re4="(\\s+)"; // White Space 1
string re5="(\\d+)"; // Integer Number 1, here 25
string re6="(\\s+)"; // White Space 2
string re7="(\\d+)"; // Integer Number 2, here 3500
Regex r = new Regex(re1+re2+re3+re4+re5+re6+re7,RegexOptions.IgnoreCase|RegexOptions.Singleline);
Match m = r.Match(txt);
if (m.Success)
{
String c1=m.Groups[1].ToString();
String word1=m.Groups[2].ToString();
String ws1=m.Groups[3].ToString();
String int1=m.Groups[4].ToString();
String ws2=m.Groups[5].ToString();
String int2=m.Groups[6].ToString();
Console.Write("("+c1.ToString()+")"+"("+word1.ToString()+")"+"("+ws1.ToString()+")"+"("+int1.ToString()+")"+"("+ws2.ToString()+")"+"("+int2.ToString()+")"+"\n");
}
This way you can store each specific value separately; for example, Groups[6] holds 3500, or whatever value is in that position.
You can create your own regex here: Regex
In short, the other answers are right (2nd way):
Just create the regex with
"([A-Za-z ]+) ([A-Za-z ]+) #([A-Za-z ]+) (\d+) (\d+)"
It will match any string in that format. You can create your own regex and test it here: Regex Tester
That is the answer to what I was trying to do:
string input = "Band Name #Location 25 3500";
Match m = Regex.Match(input, @"([A-Za-z ]+) (#[A-Za-z ]+) (\d+) (\d+)");
if (m.Success)
{
Console.WriteLine("Success!");
}

c#: how to match a variable string followed by a space and currency and pull the currency?

I'm trying to match the following cases and pull the number value:
"b 30.00"
"bill 30.00"
"bill 30"
"b 30"
I've tried:
var regex = new Regex("^b(?-i:ill)?$ ^$?d+(.d{2})?$", RegexOptions.IgnoreCase);
However, this doesn't seem to return a match, and I'm not sure how to pull the digit.
You haven't quite understood how the anchors ^ and $ work; it is worth reading up on them.
var regex = new Regex(@"^[Bb](?:ill)? \d+(?:\.\d{2})?$");
or better, since you only need ASCII digits (and not every digit character in Unicode):
var regex = new Regex(@"^[Bb](?:ill)? [0-9]+(?:\.[0-9]{2})?$");
To match a literal . you must escape it (same thing for a literal $). Note the use of a verbatim string to avoid double backslashes.
Feel free to add capture groups around what you want to capture.
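For instance, a capture group around the number makes it possible to pull the amount out directly. The sketch below is one way to do that; the class and method names are hypothetical, and parsing with the invariant culture is an assumption so that "." is always treated as the decimal separator.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class BillSketch
{
    // Group 1 wraps the amount: digits with an optional two-digit fraction.
    private static readonly Regex Rx = new Regex(@"^[Bb](?:ill)? ([0-9]+(?:\.[0-9]{2})?)$");

    // Returns the amount, or null when the input does not match the expected format.
    public static decimal? ExtractAmount(string input)
    {
        Match m = Rx.Match(input);
        if (!m.Success)
            return null;

        return decimal.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture);
    }
}
```

All four inputs from the question ("b 30.00", "bill 30.00", "bill 30", "b 30") yield 30 with this sketch.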
You didn't mention if RegEx is actually required to accomplish your goal. If RegEx is not required, and you know that your string is in a specific format, you could just split the string:
string val = "bill 30.00";
string[] split = val.Split(' ');
string name = string.Empty;
decimal currency = 0m;
if (split.Length > 1)
{
name = split[0];
decimal.TryParse(split[1], out currency);
}
new Regex(@"\b\d+(\.\d{2})?") should give you what you want
Just try the code
string Value = "bill 30.00";
string resultString = Regex.Match(Value, @"\d+").Value;
