Split a string into three separate parts - c#

I have a URL string coming into an API e.g. c1:1=25.
*http://mysite/api/controllername?serial=123&c1:=25*
I want to split it into the channel name (c1), the channel reading number (1) after the colon and the value (25).
There are also occasions where there is no colon because it is a fixed value, such as a serial number (serial=123).
I have created a class:
public class UriDataModel
{
public string ChannelName { get; set; }
public string ChannelNumber { get; set; }
public string ChannelValue { get; set; }
}
I am trying to use an IEnumerable with some LINQ and not getting very far.
var querystring = HttpContext.Current.Request.Url.Query;
querystring = querystring.Substring(1);
var urldata = new UrlDataList
{
UrlData = querystring.Split('&').ToList()
};
IEnumerable<UriDataModel> uriData =
from x in urldata.UrlData
let channelname = x.Split(':')
from y in urldata.UrlData
let channelreading = y.Split(':', '=')
from z in urldata.UrlData
let channelvalue = z.Split('=')
select new UriDataModel()
{
ChannelName = channelname[0],
ChannelNumber = channelreading[1],
ChannelValue = channelvalue[2]
};
List<UriDataModel> udm = uriData.ToList();
I feel as if I am overcomplicating things here.
In summary, I want to split the string into three parts and, where there is no colon, into two.
Any pointers will be great. TIA

You can use regex. I think you switched the channel number and the colon in your example, so my code reflects this assumption.
public static (string channelName, string channelNumber, string channelValue) ParseUrlData(string urlData)
{
var regex = new Regex(@"serial=(\d+)(&c(:\d+)?=(\d+))?");
var matches = regex.Match(urlData);
string name = null;
string number = null;
string value = null;
if (matches.Success)
{
name = matches.Groups[1].Value;
if (matches.Groups.Count == 5) number = matches.Groups[3].Value.TrimStart(':');
if (matches.Groups.Count >= 4) value = matches.Groups[matches.Groups.Count - 1].Value;
}
Console.WriteLine($"[{name}] [{number}] [{value}]");
return (name, number, value);
}
Then you can call it like this
var (channelName, channelNumber, channelValue) = ParseUrlData("serial=123&c:1=25");
(channelName, channelNumber, channelValue) = ParseUrlData("serial=123&c=25");
(channelName, channelNumber, channelValue) = ParseUrlData("serial=123");
and it'll return (and print)
[123] [1] [25]
[123] [] [25]
[123] [] []
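For comparison, here is a minimal sketch that avoids regex and fills the UriDataModel class from the question using plain Split calls. It assumes every pair has the shape name[:number]=value and that the values need no URL decoding; QueryParser and Parse are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class UriDataModel
{
    public string ChannelName { get; set; }
    public string ChannelNumber { get; set; }
    public string ChannelValue { get; set; }
}

// Hypothetical helper, not part of the question or the answer above.
public static class QueryParser
{
    // Turns "serial=123&c1:1=25" into one UriDataModel per name=value pair.
    public static List<UriDataModel> Parse(string query)
    {
        return query.TrimStart('?')
            .Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(pair =>
            {
                // At most two parts: the key and the value.
                string[] keyValue = pair.Split(new[] { '=' }, 2);
                // The key is either "name" or "name:number".
                string[] key = keyValue[0].Split(new[] { ':' }, 2);
                return new UriDataModel
                {
                    ChannelName = key[0],
                    ChannelNumber = key.Length > 1 ? key[1] : null,
                    ChannelValue = keyValue.Length > 1 ? keyValue[1] : null
                };
            })
            .ToList();
    }
}
```

ChannelNumber stays null when a pair has no colon, which makes the fixed values (such as serial) easy to tell apart from channel readings.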


How to Split and Sum Members of a String Value

I have a database column that is a text field, and this text field contains values that look like
I=5212;A=97920;D=20181121|I=5176;A=77360;D=20181117|I=5087;A=43975;D=20181109
and can vary sometimes to look like:
I=29;A=20009.34;D=20190712;F=300|I=29;A=2259.34;D=20190714;F=300
Where 'I' represents the invoice Id, 'A' the invoice amount, 'D' the date in YYYYMMDD format and 'F' the original foreign currency value if the invoice was from a foreign supplier.
I am fetching that column and binding it to a datagrid which has a button labelled "Show Amount". On button click, it fetches the selected row and splits the string to extract "A"
I need to fetch all the sections with A= within the column result... i.e
A=97920
A=77360
A=43975
Then sum them all together and display the result on a label.
I have tried splitting using '|' first, extracting the substring 'A=' then splitting it using ';' to get the amount after "=".
string cAlloc;
string[] amount;
string InvoiceTotal;
string SupplierAmount;
string BalanceUnpaid;
DataRowView dv = invoicesDataGrid.SelectedItem as DataRowView;
if (dv != null)
{
cAlloc = dv.Row.ItemArray[7].ToString();
InvoiceTotal = dv.Row.ItemArray[6].ToString();
if (invoicesDataGrid.Columns[3].ToString() == "0")
{
lblAmount.Foreground = Brushes.Red;
lblAmount.Content = "No Amount Has Been Paid Out to the Supplier";
}
else
{
amount = cAlloc.Split('|');
foreach (string i in amount)
{
string toBeSearched = "A=";
string code = i.Substring(i.IndexOf(toBeSearched) + toBeSearched.Length);
string[] res = code.Split(';');
SupplierAmount = res[0];
float InvTotIncl = float.Parse(InvoiceTotal, CultureInfo.InvariantCulture.NumberFormat);
float AmountPaid = float.Parse(SupplierAmount, CultureInfo.InvariantCulture.NumberFormat);
float BalUnpaid = InvTotIncl - AmountPaid;
BalanceUnpaid = Convert.ToString(BalUnpaid);
if (BalUnpaid == 0)
{
lblAmount.Content = "Amount Paid = " + SupplierAmount + " No Balance Remaining, Supplier Invoice Paid in Full";
}
else if (BalUnpaid < 0)
{
lblAmount.Content = "Amount Paid = " + SupplierAmount + " Supplier Paid an Excess of " + BalanceUnpaid;
}
else
{
lblAmount.Content = "Amount Paid = " + SupplierAmount + " You Still Owe the Supplier a Total of " + BalanceUnpaid;
}
}
}
But I am only able to extract A=43975, the very last "A=", instead of all three. I also have not figured out how to sum the strings. Somebody help... please.
Regex is the preferred solution. Alternatively: split, split and split.
var cAlloc = "I=29;A=20009.34;D=20190712;F=300|I=29;A=2259.34;D=20190714;F=300";
var amount = cAlloc.Split('|');
decimal sum = 0;
foreach (string i in amount)
{
foreach (var t in i.Split(';'))
{
var p = t.Split('=');
if (p[0] == "A")
{
var s = decimal.Parse(p[1], CultureInfo.InvariantCulture);
sum += s;
break;
}
}
}
You're doing too much work for something like this. Here's a simpler solution using Regex.
var in1 = "I=5212;A=97920;D=20181121|I=5176;A=77360;D=20181117|I=5087;A=43975;D=20181109";
var in2 = "I=29;A=20009.34;D=20190712;F=300|I=29;A=2259.34;D=20190714;F=300";
var reg = @"A=(\d+(\.\d+)?)";
Regex.Matches(in1, reg).OfType<Match>().Sum(m => double.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture));
Regex.Matches(in2, reg).OfType<Match>().Sum(m => double.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture));
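A variant of the regex approach, sketched with decimal instead of double because the values are money, and with an explicit invariant culture so that "20009.34" parses the same way on every machine. InvoiceAmounts and SumAmounts are hypothetical names; the lookbehind is an added assumption that guards against a longer key that merely ends in A.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class InvoiceAmounts
{
    // Sums every "A=<number>" field found anywhere in the column text.
    public static decimal SumAmounts(string allocations)
    {
        return Regex.Matches(allocations, @"(?<![A-Za-z])A=(\d+(?:\.\d+)?)")
            .Cast<Match>()
            .Sum(m => decimal.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture));
    }
}
```

The result can then be shown on the label, for example with lblAmount.Content = InvoiceAmounts.SumAmounts(cAlloc).ToString().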
If the invoice amount is always located as a second value in the set you can access it directly by index after split:
var str = "I=5212;A=97920;D=20181121|I=5176;A=77360;D=20181117|I=5087;A=43975;D=20181109";
var invoices = str.Trim().Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);
var totalSum = 0M;
foreach (var invoice in invoices)
{
var invoiceParts = invoice.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
var invoiceAmount = decimal.Parse(invoiceParts[1].Trim().Substring(2));
totalSum += invoiceAmount;
}
Otherwise, you can use a little more "flexible" solution like this:
var str = "I=5212;A=97920;D=20181121|I=5176;A=77360;D=20181117|I=5087;A=43975;D=20181109";
var invoices = str.Trim().Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);
var totalSum = 0M;
foreach (var invoice in invoices)
{
var invoiceParts = invoice.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
var invoiceAmount = decimal.Parse(invoiceParts.First(ip => ip.Trim().ToLower().StartsWith("a=")).Substring(2));
totalSum += invoiceAmount;
}
Import the input: "Deserialisation"
The following input represents a list of objects with the properties I, A and D.
var input = "I=5212;A=97920;D=20181121|I=5176;A=77360;D=20181117|I=5087;A=43975;D=20181109";
Given this simple class:
public class inputClass
{
public decimal I { get; set; }
public decimal A { get; set; }
public decimal D { get; set; }
}
Parsing it will look like:
var inputItems =
input.Split('|')
.Select(
x =>
x.Split(';')
.ToDictionary(
y => y.Split('=')[0],
y => y.Split('=')[1]
)
)
.Select(
x => // Manual parsing from dictionary to inputClass.
// If the dictionary keys match the object's properties, we could use something more generic.
new inputClass
{
I = decimal.Parse(x["I"], CultureInfo.InvariantCulture.NumberFormat),
A = decimal.Parse(x["A"], CultureInfo.InvariantCulture.NumberFormat),
D = decimal.Parse(x["D"], CultureInfo.InvariantCulture.NumberFormat),
}
)
.ToList();
Does it look complex? Let's give inputClass the responsibility to initialise itself from a string of the form
PropertyName=Value[; PropertyName=Value]:
public inputClass(string input, NumberFormatInfo numberFormat)
{
var dict = input
.Split(';')
.ToDictionary(
y => y.Split('=')[0],
y => y.Split('=')[1]
);
I = decimal.Parse(dict["I"], numberFormat);
A = decimal.Parse(dict["A"], numberFormat);
D = decimal.Parse(dict["D"], numberFormat);
}
Then the parsing is simple:
var inputItems = input.Split('|').Select(x => new inputClass(x, CultureInfo.InvariantCulture.NumberFormat));
Once we have a more usable structure, a list of objects, we can easily compute Sum, Avg, Max and Min:
var sumA = inputItems.Sum(x => x.A);
Producing the output: "Serialisation"
To produce the output, we define a class similar to the input class:
public class outputClass
{
public decimal I { get; set; }
public decimal A { get; set; }
public decimal D { get; set; }
public decimal F { get; set; }
The class should be able to produce the string PropertyName=Value[; PropertyName=Value]:
public override string ToString()
{
return $"I={I};A={A};D={D};F={F}";
}
}
Then compute the output list from the input list and serialise it to a string:
//process The input into the output.
var outputItems = new List<outputClass>();
foreach (var item in inputItems)
{
// compute things to be able to create the next output item
item.A++;
outputItems.Add(
new outputClass { A = item.A, D = item.D, I = item.I, F = 42 }
);
}
// "Serialisation"
var outputString = String.Join("|", outputItems);
Online Demo. https://dotnetfiddle.net/VcEQmf
Long story short:
Define a class with the properties you will use or display.
Add a constructor that takes a string like "I=5212;A=97920;D=20181121".
NB: the string may contain properties that are not mapped to the object.
Override ToString() so the class can easily produce its own serialisation.
NB: properties and values that are not stored in the object will not appear in the serialised result.
Now you simply have to split on your record separator "|" and you can work with real objects instead of that awkward string.
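The steps in this summary can be pulled together into one self-contained sketch. It is an illustration under stated assumptions rather than the code from the linked demo: InvoiceAllocation, Parse and ParseAll are hypothetical names, it uses a static Parse method where the answer uses a constructor, the date is kept as raw text, and F is treated as optional.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical class; the answer above calls its classes inputClass and outputClass.
public class InvoiceAllocation
{
    public int I { get; set; }        // invoice id
    public decimal A { get; set; }    // invoice amount
    public string D { get; set; }     // date, kept as the raw YYYYMMDD text
    public decimal? F { get; set; }   // optional foreign-currency value

    // Parses one record such as "I=29;A=20009.34;D=20190712;F=300".
    public static InvoiceAllocation Parse(string record)
    {
        Dictionary<string, string> fields = record
            .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(part => part.Split(new[] { '=' }, 2))
            .ToDictionary(kv => kv[0].Trim(), kv => kv[1].Trim());

        return new InvoiceAllocation
        {
            I = int.Parse(fields["I"], CultureInfo.InvariantCulture),
            A = decimal.Parse(fields["A"], CultureInfo.InvariantCulture),
            D = fields["D"],
            F = fields.TryGetValue("F", out var f)
                ? decimal.Parse(f, CultureInfo.InvariantCulture)
                : (decimal?)null
        };
    }

    // Splits the whole column on the record separator and parses each record.
    public static List<InvoiceAllocation> ParseAll(string column)
    {
        return column.Split('|').Select(Parse).ToList();
    }

    // Serialises back to the PropertyName=Value[;PropertyName=Value] form.
    public override string ToString()
    {
        string text = string.Format(CultureInfo.InvariantCulture, "I={0};A={1};D={2}", I, A, D);
        return F.HasValue
            ? text + string.Format(CultureInfo.InvariantCulture, ";F={0}", F.Value)
            : text;
    }
}
```

With this in place, InvoiceAllocation.ParseAll(text).Sum(x => x.A) gives the total, and string.Join("|", items) rebuilds the column text.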
PS:
There was a little misunderstanding about your two types of input: I mentally saw them as input and output. Don't mind those names. It can be the same class, and that doesn't change anything in this answer.

c# use one variable value to set a second from a fixed list

I'm parsing a CSV file in a C# .NET Windows Forms app, taking each line into a class I've created. However, I only need access to some of the columns, and the incoming files are not standardized. That is to say, the number of fields could differ and the columns could appear in any position.
CSV Example 1:
Position, LOCATION, TAG, NAME, STANDARD, EFFICIENCY, IN USE,,
1, AFT-D3, P-D3101A, EQUIPMENT 1, A, 3, TRUE
2, AFT-D3, P-D3103A, EQUIPMENT 2, B, 3, FALSE
3, AFT-D3, P-D2301A, EQUIPMENT 3, A, 3, TRUE
...
CSV Example 2:
Position, TAG, STANDARD, NAME, EFFICIENCY, LOCATION, BACKUP, TESTED,,
1, P-D3101A, A, EQUIPMENT 1, 3, AFT-D3, FALSE, TRUE
2, P-D3103A, A, EQUIPMENT 2, 3, AFT-D3, TRUE, FALSE
3, P-D2301A, A, EQUIPMENT 3, 3, AFT-D3, FALSE, TRUE
...
As you can see, I will never know the format of the file I have to analyse, the only thing I know for sure is that it will always contain the few columns that I need.
My solution to this was to ask the user to enter the required columns as strings, then convert each entry to a corresponding integer that I could use as a location.
string standardInpt = "";
string nameInpt = "";
string efficiencyInpt = "";
The user would then enter a value from A to ZZ.
int standardLocation = 0;
int nameLocation = 0;
int efficiencyLocation = 0;
When the form is submitted, the ints get their final values by running through an if/else statement:
if(standard == "A")
{
standardLocation = 0;
}
else if(standard == "B")
{
standardLocation = 1;
}
...
and so on, all the way to if VAR1 == ZZ, and then the code is repeated for VAR2, VAR3 and so on.
My class would partially look like:
class Equipment
{
public string Standard { get; set;}
public string Name { get; set; }
public int Efficiency { get; set; }
static Equipment FromLine(string line)
{
var data = line.Split(',');
return new Equipment()
{
Standard = data[standardLocation],
Name = data[nameLocation],
Efficiency = int.Parse(data[efficiencyLocation]),
};
}
}
I've got more code in there, but I think this highlights where I would use the variables to set the indexes.
I'm very new to this and I'm hoping there is a significantly better way to achieve this without writing so much repetitive if/else logic. I'm thinking of some kind of lookup table, but I can't figure out how to implement it. Any pointers on where I could look?
You could make it automatic by finding the indexes of the columns in the header, and then use them to read the values from the correct place from the rest of the lines:
class EquipmentParser {
public IList<Equipment> Parse(string[] input) {
var result = new List<Equipment>();
var header = input[0].Split(',').Select(t => t.Trim().ToLower()).ToList();
var standardPosition = GetIndexOf(header, "std", "standard", "st");
var namePosition = GetIndexOf(header, "name", "nm");
var efficiencyPosition = GetIndexOf(header, "efficiency", "eff");
foreach (var s in input.Skip(1)) {
var line = s.Split(',');
result.Add(new Equipment {
Standard = line[standardPosition].Trim(),
Name = line[namePosition].Trim(),
Efficiency = int.Parse(line[efficiencyPosition])
});
}
return result;
}
private int GetIndexOf(IList<string> input, params string[] needles) {
return Array.FindIndex(input.ToArray(), needles.Contains);
}
}
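The lookup table the question asks about can also be built straight from the header row. The following is a minimal sketch, assuming header names are unique after trimming and that the empty trailing columns (",,") can be ignored; CsvHeader and BuildIndex are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class CsvHeader
{
    // Maps each trimmed header name to its zero-based column index (case-insensitive).
    public static Dictionary<string, int> BuildIndex(string headerLine)
    {
        var index = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        string[] names = headerLine.Split(',');
        for (int i = 0; i < names.Length; i++)
        {
            string name = names[i].Trim();
            // Skip empty trailing columns and keep the first of any duplicates.
            if (name.Length > 0 && !index.ContainsKey(name))
            {
                index[name] = i;
            }
        }
        return index;
    }
}
```

A data line can then be read with data[columns["STANDARD"]].Trim(), where columns is the dictionary returned for that file's header, so the column order no longer matters.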
You can use reflection and attributes.
Write the possible header names, separated by commas, into a DisplayName attribute on each property.
First call GetIndexes with the csv header string as parameter to get the mapping dictionary of class properties and csv fields.
Then call FromLine with each line and the mapping dictionary you just got.
class Equipment
{
[DisplayName("STND, STANDARD, ST")]
public string Standard { get; set; }
[DisplayName("NAME")]
public string Name { get; set; }
[DisplayName("EFFICIENCY, EFFI")]
public int Efficiency { get; set; }
// You can add any other property
public static Equipment FromLine(string line, Dictionary<PropertyInfo, int> map)
{
var data = line.Split(',').Select(t => t.Trim()).ToArray();
var ret = new Equipment();
Type type = typeof(Equipment);
foreach (PropertyInfo property in type.GetProperties())
{
int index = map[property];
property.SetValue(ret, Convert.ChangeType(data[index],
property.PropertyType));
}
return ret;
}
public static Dictionary<PropertyInfo, int> GetIndexes(string headers)
{
var headerArray = headers.Split(',').Select(t => t.Trim()).ToArray();
Type type = typeof(Equipment);
var ret = new Dictionary<PropertyInfo, int>();
foreach (PropertyInfo property in type.GetProperties())
{
var fieldNames = property.GetCustomAttribute<DisplayNameAttribute>()
.DisplayName.Split(',').Select(t => t.Trim()).ToArray();
for (int i = 0; i < headerArray.Length; ++i)
{
if (!fieldNames.Contains(headerArray[i])) continue;
ret[property] = i;
break;
}
}
return ret;
}
}
Try this, if it helps:
public int GetIndex(string input)
{
input = input.ToUpper();
char low = input[input.Length - 1];
char? high = input.Length == 2 ? input[0] : (char?)null;
int indexLow = low - 'A';
int? indexHigh = high.HasValue ? high.Value - 'A' : (int?)null;
return (indexHigh.HasValue ? (indexHigh.Value + 1) * 26 : 0) + indexLow;
}
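The same idea generalises to any number of letters by treating the input as a bijective base-26 number. This is a sketch with hypothetical names (ColumnLetters, ToIndex); unlike the method above, it also rejects anything that is not a letter from A to Z.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class ColumnLetters
{
    // Converts spreadsheet-style letters to a zero-based index: A -> 0, Z -> 25, AA -> 26.
    public static int ToIndex(string letters)
    {
        if (string.IsNullOrWhiteSpace(letters))
            throw new ArgumentException("Column letters are required.", nameof(letters));

        int value = 0;
        foreach (char c in letters.Trim().ToUpperInvariant())
        {
            if (c < 'A' || c > 'Z')
                throw new ArgumentException("Only the letters A-Z are allowed.", nameof(letters));
            // Each step shifts the letters read so far one place to the left.
            value = value * 26 + (c - 'A' + 1);
        }
        return value - 1;
    }
}
```

Each of the user's entries then needs a single call, for example standardLocation = ColumnLetters.ToIndex(standardInpt).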
You can use the ASCII code for that, so there is no need to add an if/else branch for every letter. Note that this handles single letters A to Z only.
For example:
byte[] ASCIIValues = Encoding.ASCII.GetBytes(standard);
standardLocation = ASCIIValues[0]-65;

How to read file that contains one row with multiple records- C# [closed]

I have this text file that only has one row. Each file contains one customer name but multiple items and descriptions.
Record starting with 00 (Company Name) has a char length of 10
01 (Item#) - char length of 10
02 (Description) - char length of 50
I know how to read a file, but I don't have any idea of how to loop through only one line, find records 00, 01, 02 and grab the text based on the length, finally start at the position of the last records and start the loop again. Can someone please give me an idea of how to read files like this?
output:
companyName 16622 Description
companyName 15522 Description
input text file example
00Init 0115522 02Description 0116622 02Description
This solution assumes that the data is fixed width, and that the item number will precede the description (01 before 02). It emits a record every time a description record is encountered, and it handles multiple products for the same company.
First, define a class to hold your data:
public class Record
{
public string CompanyName { get; set; }
public string ItemNumber { get; set; }
public string Description { get; set; }
}
Then, iterate through your string, returning a record when you've got a description:
public static IEnumerable<Record> ReadFile(string input)
{
// Alter these as appropriate
const int RECORDTYPELENGTH = 2;
const int COMPANYNAMELENGTH = 41;
const int ITEMNUMBERLENGTH = 8;
const int DESCRIPTIONLENGTH = 48;
int index = 0;
string companyName = null;
string itemNumber = null;
while (index < input.Length)
{
string recordType = input.Substring(index, RECORDTYPELENGTH);
index += RECORDTYPELENGTH;
if (recordType == "00")
{
companyName = input.Substring(index, COMPANYNAMELENGTH).Trim();
index += COMPANYNAMELENGTH;
}
else if (recordType == "01")
{
itemNumber = input.Substring(index, ITEMNUMBERLENGTH).Trim();
index += ITEMNUMBERLENGTH;
}
else if (recordType == "02")
{
string description = input.Substring(index, DESCRIPTIONLENGTH).Trim();
index += DESCRIPTIONLENGTH;
yield return new Record
{
CompanyName = companyName,
ItemNumber = itemNumber,
Description = description
};
}
else
{
throw new FormatException("Unexpected record type " + recordType);
}
}
}
Note that your field lengths in the question don't match the sample data, so I adjusted them so that the solution worked with the data you provided. You can adjust the field lengths by adjusting the constants.
Use this like the following:
string input = "00CompanyName 0115522 02Description 0116622 02Description ";
foreach (var record in ReadFile(input))
{
Console.WriteLine("{0}\t{1}\t{2}", record.CompanyName, record.ItemNumber, record.Description);
}
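Instead of constants, the field lengths can be passed in as a lookup table keyed by record type, which avoids editing the method when the layout changes. The sketch below is a hypothetical variation of the loop above (FixedWidthReader and ReadFields are made-up names). It only splits the line into typed fields and leaves the grouping into records to the caller, and it tolerates a final field whose trailing padding has been trimmed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class FixedWidthReader
{
    // Splits a line into (record type, trimmed value) pairs using a type -> length table.
    public static List<KeyValuePair<string, string>> ReadFields(
        string input, IDictionary<string, int> fieldLengths)
    {
        const int RecordTypeLength = 2;
        var fields = new List<KeyValuePair<string, string>>();
        int index = 0;
        while (index + RecordTypeLength <= input.Length)
        {
            string recordType = input.Substring(index, RecordTypeLength);
            index += RecordTypeLength;
            if (!fieldLengths.TryGetValue(recordType, out int length))
                throw new FormatException("Unexpected record type " + recordType);

            // The last field may be shorter than declared if its padding was trimmed.
            int available = Math.Min(length, input.Length - index);
            fields.Add(new KeyValuePair<string, string>(
                recordType, input.Substring(index, available).Trim()));
            index += available;
        }
        return fields;
    }
}
```

With the lengths stated in the question, the table would be { "00", 10 }, { "01", 10 }, { "02", 50 }.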
If you read the whole file into a string, you have a couple of options.
One is String.Split.
Another is String.IndexOf: once you have the index, you can use String.Substring.
Assuming fixed-width fields as specified, let's create two simple classes to hold a client and its related data as a list:
// can hold as many items (data) as there are in the line
public class Client
{
public string name;
public List<ClientData> data;
};
// one single item in the client data
public class ClientData
{
public string code;
public string description;
};
To parse a single line (which is assumed to have a single client and a successive list of item/description), we can do this (note: for simplification I'm just creating a static class with a static method in it):
// this parser will read as many items as there are in the line
// and return a Client instance with those inside.
public static class Parser
{
public static Client ParseData(string line)
{
Client client = new Client ();
client.data = new List<ClientData> ();
client.name = line.Substring (2, 10);
// remove the client name
line = line.Substring (12);
while (line.Length > 0)
{
// create new item
ClientData data = new ClientData ();
data.code = line.Substring (2, 10);
data.description = line.Substring (14, 50);
client.data.Add (data);
// next item
line = line.Substring (64);
}
return client;
}
}
So, in your main loop, just after reading a new line from the file, you can call the above method to receive a new client. Something like this:
// should be from a file but this is just an example
string[] lines = {
"00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
"00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
"00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
"00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
"00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
};
// loop through each line
// (lines can have multiple items)
foreach (string line in lines)
{
Client client = Parser.ParseData (line);
Console.WriteLine ("Read: " + client.name);
}
Contents of Sample.txt:
00Company1 0115522 02This is a description for company 1. 00Company2 0115523 02This is a description for company 2. 00Company3 0115524 02This is a description for company 3
Note that in the code below, the fields are 2 characters longer than those specified in the original question. This is because I am including the headings in the length of each field, so a field with a length of 10 is effectively 12 once the 00 from the heading is included. If this is undesirable, tweak the offsets of the entries in the fieldLengths array.
String directory = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
String file = "Sample.txt";
String path = Path.Combine(directory, file);
Int32[] fieldLengths = new Int32[] { 12, 12, 52 };
List<RowData> rows = new List<RowData>();
Byte[] buffer = new Byte[fieldLengths.Sum()];
using (var stream = File.OpenRead(path))
{
while (stream.Read(buffer, 0, buffer.Length) > 0)
{
List<String> fieldValues = new List<String>();
Int32 offset = 0;
for (int i = 0; i < fieldLengths.Length; i++)
{
var value = Encoding.UTF8.GetString(buffer, offset, fieldLengths[i]);
fieldValues.Add(value);
offset += fieldLengths[i];
}
String companyName = fieldValues[0];
String itemNumber = fieldValues[1];
String description = fieldValues[2];
var row = new RowData(companyName, itemNumber, description);
rows.Add(row);
}
}
Class definition for RowData:
public class RowData
{
public String Company { get; set; }
public String Number { get; set; }
public String Description { get; set; }
public RowData(String company, String number, String description)
{
Company = company;
Number = number;
Description = description;
}
}
The results will be in the rows variable.
You would have to split rows based on a delimiter. It would seem that in your case you are using whitespace as a delimiter.
The method you are looking for is String.Split(), it should cover your needs :) Documentation is located at https://msdn.microsoft.com/en-us/library/system.string.split(v=vs.110).aspx - It also includes examples.
I'd do something like this:
string myLineOfText = "MyCompany 12345 The description of my company";
string[] partsOfMyLine = myLineOfText.Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries);
Best of luck! :)

Extract some values in formatted string

I would like to retrieve values from a string formatted like this:
public var any:int = 0;
public var anyId:Number = 2;
public var theEnd:Vector.<uint>;
public var test:Boolean = false;
public var others1:Vector.<int>;
public var firstValue:CustomType;
public var field2:Boolean = false;
public var secondValue:String = "";
public var isWorks:Boolean = false;
I want to store field name, type and value in a custom class Property :
public class Property
{
public string Name { get; set; }
public string Type { get; set; }
public string Value { get; set; }
}
And get these values with a regular expression.
How can I do this?
Thanks
EDIT : I tried this but I don't know how to go further with vectors..etc
/public var ([a-zA-Z0-9]*):([a-zA-Z0-9]*)( = \"?([a-zA-Z0-9]*)\"?)?;/g
Ok, posting my regex-based answer.
Your regex - /public var ([a-zA-Z0-9]*):([a-zA-Z0-9]*)( = \"?([a-zA-Z0-9]*)\"?)?;/g - contains regex delimiters. They are not supported in C# and are therefore treated as literal symbols. You need to remove them together with the g modifier: to obtain multiple matches in C#, use Regex.Matches, or Regex.Match in a while loop with Match.Success and .NextMatch().
The regex I am using is (?<=\s*var\s*)(?<name>[^=:\n]+):(?<type>[^;=\n]+)(?:=(?<value>[^;\n]+))?. The newline symbols are included as negated character classes can match a newline character.
var str = "public var any:int = 0;\r\npublic var anyId:Number = 2;\r\npublic var theEnd:Vector.<uint>;\r\npublic var test:Boolean = false;\r\npublic var others1:Vector.<int>;\r\npublic var firstValue:CustomType;\r\npublic var field2:Boolean = false;\r\npublic var secondValue:String = \"\";\r\npublic var isWorks:Boolean = false;";
var rx = new Regex(@"(?<=\s*var\s*)(?<name>[^=:\n]+):(?<type>[^;=\n]+)(?:=(?<value>[^;\n]+))?");
var coll = rx.Matches(str);
var props = new List<Property>();
foreach (Match m in coll)
props.Add(new Property(m.Groups["name"].Value,m.Groups["type"].Value, m.Groups["value"].Value));
foreach (var item in props)
Console.WriteLine("Name = " + item.Name + ", Type = " + item.Type + ", Value = " + item.Value);
Or with LINQ:
var props = rx.Matches(str)
.OfType<Match>()
.Select(m =>
new Property(m.Groups["name"].Value,
m.Groups["type"].Value,
m.Groups["value"].Value))
.ToList();
And the class example:
public class Property
{
public string Name { get; set; }
public string Type { get; set; }
public string Value { get; set; }
public Property()
{}
public Property(string n, string t, string v)
{
this.Name = n;
this.Type = t;
this.Value = v;
}
}
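The Regex.Match / NextMatch() alternative mentioned at the start of this answer can be sketched as follows. It reuses the same pattern; DeclarationScanner and Scan are hypothetical names, and each result is flattened to a "name|type|value" string only to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class DeclarationScanner
{
    // Same pattern as in the answer, written as a C# verbatim string.
    private static readonly Regex Declaration = new Regex(
        @"(?<=\s*var\s*)(?<name>[^=:\n]+):(?<type>[^;=\n]+)(?:=(?<value>[^;\n]+))?");

    // Returns "name|type|value" for each declaration, walking the matches one by one.
    public static List<string> Scan(string source)
    {
        var found = new List<string>();
        for (Match m = Declaration.Match(source); m.Success; m = m.NextMatch())
        {
            found.Add(string.Join("|",
                m.Groups["name"].Value.Trim(),
                m.Groups["type"].Value.Trim(),
                m.Groups["value"].Value.Trim()));
        }
        return found;
    }
}
```

Walking the matches this way gives the same results as Regex.Matches; it is mainly useful when you want to stop early.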
NOTE ON PERFORMANCE:
The regex is not the quickest, but it beat the one in the other answer in a test performed at regexhero.net.
It seems that you don't need regular expressions in a simple case like the one you've provided:
String text =
@"public var any:int = 0;
public var anyId:Number = 2;
public var theEnd:Vector.<uint>;
public var test:Boolean = false;
public var others1:Vector.<int>;
public var firstValue:CustomType;
public var field2:Boolean = false;";
List<Property> result = text
.Split(new Char[] {'\r','\n'}, StringSplitOptions.RemoveEmptyEntries)
.Select(line => {
int varIndex = line.IndexOf("var") + "var".Length;
int colonIndex = line.IndexOf(':');
int equalsIndex = line.IndexOf('=');
// '=' can be absent; the type then runs up to the terminating ';'
int typeEnd = equalsIndex >= 0 ? equalsIndex : line.TrimEnd().TrimEnd(';').Length;
return new Property() {
Name = line.Substring(varIndex, colonIndex - varIndex).Trim(),
Type = line.Substring(colonIndex + 1, typeEnd - colonIndex - 1).Trim(),
Value = equalsIndex >= 0 ? line.Substring(equalsIndex + 1).Trim(' ', ';') : ""
};
})
.ToList();
If the text can contain comments and other stuff, e.g.
"public (*var is commented out*) var sample: int = 123;;;; // another comment"
you have to implement a parser.
You can use the following pattern:
\s*(?<vis>\w+?)\s+var\s+(?<name>\w+?)\s*:\s*(?<type>\S+?)(\s*=\s*(?<value>\S+?))?\s*;
to match each element in a line. Appending ? after a quantifier results in a non-greedy match which makes the pattern a lot simpler - no need to negate all unwanted classes.
Values are optional, so the value group is wrapped in another, optional group (\s*=\s*(?<value>\S+?))?
RegexOptions.Multiline only changes how ^ and $ behave. This pattern has no anchors, so the option is harmless here but not required.
The C# 6 syntax in the following example isn't required, but multiline string literals and interpolated strings make for much cleaner code.
var input= @"public var any:int = 0;
public var anyId:Number = 2;
public var theEnd:Vector.<uint>;
public var test:Boolean = false;
public var others1:Vector.<int>;
public var firstValue:CustomType;
public var field2:Boolean = false;
public var secondValue:String = """";
public var isWorks:Boolean = false;";
var pattern= @"\s*(?<vis>\w+?)\s+var\s+(?<name>\w+?)\s*:\s*(?<type>\S+?)(\s*=\s*(?<value>\S+?))?\s*;";
var regex = new Regex(pattern, RegexOptions.Multiline);
var results=regex.Matches(input);
foreach (Match m in results)
{
var g = m.Groups;
Console.WriteLine($"{g["name"],-15} {g["type"],-10} {g["value"],-10}");
}
var properties = (from m in results.OfType<Match>()
let g = m.Groups
select new Property
{
Name = g["name"].Value,
Type = g["type"].Value,
Value = g["value"].Value
})
.ToList();
I would consider using a parser generator like ANTLR though, if I had to parse more complex input or if there are multiple patterns to match. Learning how to write the grammar takes some time, but once you learn it, it's easy to create parsers that can match input that would require very complicated regular expressions. Whitespace management also becomes a lot easier
In this case, the grammar could be something like:
property : visibility var name COLON type (EQUALS value)? SEMICOLON;
visibility : ALPHA+;
var : ALPHA ALPHA ALPHA;
name : ALPHANUM+;
type : (ALPHANUM|DOT|LEFT|RIGHT);
value : ALPHANUM
| literal;
literal : DOUBLE_QUOTE ALPHANUM* DOUBLE_QUOTE;
ALPHANUM : ALPHA
| DIGIT;
ALPHA : [A-Z][a-z];
DIGIT : [0-9];
...
WS : [\r\n\s] -> skip;
With a parser, adding e.g. comments would be as simple as adding comment before SEMICOLON in the property rule, plus a new comment rule that matches the pattern of a comment.

Split string into class

I have an array which contains following values:
str[0]= "MeterNr 29202"
str[1]="- 20111101: position 61699 (Previous calculation) "
str[2]="- 20111201: position 68590 (Calculation) consumption 6891 kWh"
str[3]="- 20111101: position 75019 (Previous calculation) "
str[4]="MeterNr 50273"
str[5]="- 20111101: position 18103 (Previous reading) "
str[6]="- 20111201: position 19072 (Calculation) consumption 969 kWh "
I want to split the rows in logical order so that I can store them in the following Reading class. I have problems with splitting the values. Everything in brackets () is the ItemDescription.
I would be thankful for a quick answer.
public class Reading
{
public string MeterNr { get; set; }
public string ItemDescription { get; set; }
public string Date { get; set; }
public string Position { get; set; }
public string Consumption { get; set; }
}
You should parse the values one by one.
If you have a string, which starts with "MeterNr", you should save it as currentMeterNumber and parse the values further.
Otherwise, you can parse the values with Regex:
var dateRegex = new Regex(@"(?<=-\s)(?<year>\d{4})(?<month>\d{2})(?<day>\d{2})");
var positionRegex = new Regex(@"(?<=position\s+)(\d+)");
var descriptionRegex = new Regex(@"(?<=\()(?<description>[^)]+)(?=\))");
var consumptionRegex = new Regex(@"(?<=consumption\s+)(?<consumption>(?<consumptionValue>\d+)\s(?<consumptionUom>\w+))");
I hope you will be able to create the final algorithm and to understand how each of those expressions works. A final step could be to combine them all into a single Regex. You should do it yourself to enhance your skills.
P.S.: There are a lot of tutorials on the Internet.
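For readers who want to check their own attempt, one possible combined expression is sketched below. It assumes every reading line has the shape "- date: position N (description)" with an optional "consumption N unit" tail, and that readings belong to the most recent MeterNr line. ReadingParser is a hypothetical name, and the Reading class is repeated from the question so that the block is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class Reading
{
    public string MeterNr { get; set; }
    public string ItemDescription { get; set; }
    public string Date { get; set; }
    public string Position { get; set; }
    public string Consumption { get; set; }
}

// Hypothetical helper for illustration only.
public static class ReadingParser
{
    private static readonly Regex MeterLine = new Regex(@"^\s*MeterNr\s+(?<meter>\d+)");
    private static readonly Regex ReadingLine = new Regex(
        @"^\s*-\s*(?<date>\d{8}):\s*position\s+(?<position>\d+)\s*" +
        @"\((?<description>[^)]*)\)" +
        @"(?:\s*consumption\s+(?<consumption>\d+\s*\w+))?");

    public static List<Reading> Parse(IEnumerable<string> lines)
    {
        var readings = new List<Reading>();
        string currentMeter = null;
        foreach (string line in lines)
        {
            Match meter = MeterLine.Match(line);
            if (meter.Success)
            {
                // Remember the meter number for the reading lines that follow.
                currentMeter = meter.Groups["meter"].Value;
                continue;
            }

            Match m = ReadingLine.Match(line);
            if (!m.Success) continue; // skip lines that fit neither shape

            readings.Add(new Reading
            {
                MeterNr = currentMeter,
                Date = m.Groups["date"].Value,
                Position = m.Groups["position"].Value,
                ItemDescription = m.Groups["description"].Value,
                Consumption = m.Groups["consumption"].Success ? m.Groups["consumption"].Value : null
            });
        }
        return readings;
    }
}
```

Consumption keeps the unit (for example "6891 kWh") and stays null for lines that have no consumption part.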
I would just use a for loop and string indexes etc, but then I am a bit simple like that! Not sure of your data (i.e. if things might be missing) but this would work on the data you have posted...
var readings = new List<Reading>();
int meterNrLength = "MeterNr".Length;
int positionLength = "position".Length;
int consumptionLength = "consumption".Length;
string meterNr = null;
foreach(var s in str)
{
int meterNrIndex = s.IndexOf("MeterNr",
StringComparison.OrdinalIgnoreCase);
if (meterNrIndex != -1)
{
meterNr = s.Substring(meterNrIndex + meterNrLength).Trim();
continue;
}
var reading = new Reading {MeterNr = meterNr};
string rest = s.Substring(0, s.IndexOf(':'));
reading.Date = rest.Substring(1).Trim();
rest = s.Substring(s.IndexOf("position") + positionLength);
int bracketIndex = rest.IndexOf('(');
reading.Position = rest.Substring(0, bracketIndex).Trim();
rest = rest.Substring(bracketIndex + 1);
reading.ItemDescription = rest.Substring(0, rest.IndexOf(")"));
int consumptionIndex = rest.IndexOf("consumption",
StringComparison.OrdinalIgnoreCase);
if (consumptionIndex != -1)
{
reading.Consumption = rest.Substring(consumptionIndex + consumptionLength).Trim();
}
readings.Add(reading);
}
public static List<Reading> Parser(this string[] str)
{
List<Reading> result = new List<Reading>();
string meterNr = "";
Reading reading;
foreach (string s in str)
{
MatchCollection mc = Regex.Matches(s, "\\d+|\\((.*?)\\)");
if (mc.Count == 1)
{
meterNr = mc[0].Value;
continue;
}
reading = new Reading()
{
MeterNr = meterNr,
Date = mc[0].Value,
Position = mc[1].Value,
ItemDescription = mc[2].Value.TrimStart('(').TrimEnd(')')
};
if (mc.Count == 4)
reading.Consumption = mc[3].Value;
result.Add(reading);
}
return result;
}
