Pattern based string parse - c#

When I need to stringify some values by joining them with commas, I do, for example:
string.Format("{0},{1},{2}", item.Id, item.Name, item.Count);
And have, for example, "12,Apple,20".
Then I want to do the opposite operation: get the values back from a given string. Something like:
parseFromString(str, out item.Id, out item.Name, out item.Count);
I know this is possible in C, but I don't know of such a function in C#.

Yes, this is easy enough. You just use the String.Split method to split the string on every comma.
For example:
string myString = "12,Apple,20";
string[] subStrings = myString.Split(',');
foreach (string str in subStrings)
{
Console.WriteLine(str);
}

Possible implementations would use String.Split or Regex.Match, for example:
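public void parseFromString(string input, out int id, out string name, out int count)
{
    var split = input.Split(',');
    if (split.Length != 3) // perhaps more validation here
    {
        // out parameters must be assigned on every return path, so fail loudly instead
        throw new FormatException("Expected three comma-separated values.");
    }
    id = int.Parse(split[0]);
    name = split[1];
    count = int.Parse(split[2]);
}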
or
public void parseFromString(string input, out int id, out string name, out int count)
{
    var r = new Regex(@"(\d+),(\w+),(\d+)", RegexOptions.IgnoreCase);
    var match = r.Match(input);
    if (!match.Success)
    {
        // out parameters must be assigned on every return path, so fail loudly instead
        throw new FormatException("Input does not match the expected pattern.");
    }
    id = int.Parse(match.Groups[1].Value);
    name = match.Groups[2].Value;
    count = int.Parse(match.Groups[3].Value);
}
Edit: Finally, SO has a bunch of threads on scanf implementations in C#:
Looking for C# equivalent of scanf
how do I do sscanf in c#

If you can assume the string's format, and especially that item.Name does not contain a comma:
void parseFromString(string str, out int id, out string name, out int count)
{
string[] parts = str.Split(',');
id = int.Parse(parts[0]);
name = parts[1];
count = int.Parse(parts[2]);
}
This will simply do what you want, but I would suggest you add some error checking. Better still, consider serializing/deserializing to XML or JSON.
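The error checking suggested above could be sketched roughly as follows. This is only one possible shape: the class name ItemParser and the method name TryParseItem are hypothetical, and it still assumes the name never contains a comma.

```csharp
using System;

public static class ItemParser
{
    // Hypothetical helper: reports failure through its return value instead of throwing.
    // The out values are only meaningful when the method returns true.
    public static bool TryParseItem(string input, out int id, out string name, out int count)
    {
        // out parameters must be assigned on every path, so start with defaults.
        id = 0;
        name = string.Empty;
        count = 0;

        if (string.IsNullOrEmpty(input))
            return false;

        string[] parts = input.Split(',');
        if (parts.Length != 3)
            return false;

        // Parse into locals first so a late failure leaves the defaults untouched.
        if (!int.TryParse(parts[0], out int parsedId))
            return false;
        if (!int.TryParse(parts[2], out int parsedCount))
            return false;

        id = parsedId;
        name = parts[1];
        count = parsedCount;
        return true;
    }
}
```

A caller would then branch on the result, for example: if (ItemParser.TryParseItem(str, out int id, out string name, out int count)) { ... }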

Use the Split function:
var result = "12,Apple,20".Split(',');

Related

How to parse this single line Console input most efficiently

I am trying to get input from the user on a single line with [, ,] separators, like this:
[Q,W,1] [R,T,3] [Y,U,9]
And then I will use these inputs in a function like this:
f.MyFunction('Q','W',1); // Third parameter will be taken as integer
f.MyFunction('R','T',3);
f.MyFunction('Y','U',9);
I thought I could do something like:
string input = Console.ReadLine();
string input1 = input.Split(' ')[0];
char input2 = input.Trim(',') [0];
But it seems to repeat a lot.
What would be the most logical way to do this?
Sometimes a regular expression really is the best tool for the job. Use a pattern that matches the input pattern and use Regex.Matches to extract all the possible inputs:
var funcArgRE = new Regex(@"\[(.),(.),(\d+)\]", RegexOptions.Compiled);
foreach (Match match in funcArgRE.Matches(input)) {
var g = match.Groups;
f.MyFunction(g[1].Value[0], g[2].Value[0], Int32.Parse(g[3].Value));
}
Well, you could use LINQ to Objects and do something like this:
var inputs = input.Split(' ')
.Select(x =>
x.Replace("[", "")
.Replace("]", ""))
.Select(x => new UserInput(x))
.ToList();
foreach(var userInput in inputs)
{
f.MyFunction(userInput.A, userInput.B, userInput.Number);
}
// Somewhere else
public record UserInput
{
public UserInput(string input)
{
//Do some kind of validation here and throw exception accordingly
var parts = input.Split(',');
A = parts[0][0];
B = parts[1][0];
Number = Convert.ToInt32(parts[2]);
}
public char A { get; init; }
public char B { get; init; }
public int Number { get; init; }
};
Or you could go further and define an implicit conversion operator on the UserInput record, making it possible to convert implicitly from string to UserInput.
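As a rough sketch of that last idea, the record could carry a user-defined implicit conversion from string. The validation shown here is an assumption about what "some kind of validation" might look like; it is not part of the original answer.

```csharp
using System;

public record UserInput
{
    public UserInput(string input)
    {
        // Assumed validation: exactly three comma-separated parts, the first two non-empty.
        var parts = input.Split(',');
        if (parts.Length != 3 || parts[0].Length == 0 || parts[1].Length == 0)
            throw new FormatException($"Expected 'A,B,number' but got '{input}'.");

        A = parts[0][0];
        B = parts[1][0];
        Number = int.Parse(parts[2]); // throws FormatException if not a number
    }

    public char A { get; init; }
    public char B { get; init; }
    public int Number { get; init; }

    // Lets a string be used wherever a UserInput is expected.
    public static implicit operator UserInput(string input) => new UserInput(input);
}
```

With this in place, an assignment such as UserInput u = "Q,W,1"; compiles and runs the constructor. Since the conversion can throw on bad input, an explicit operator may be the safer choice in some codebases.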

Parse a string in c# after reading first and last alphabet

I want to cut a string in C# after reading the first and last letters.
string name = "20150910000549659ABCD000007348summary.pdf";
string result = "ABCD000007348"; // Something like this
string name = "1234 ABCD000007348 summary.pdf";
After "1234" comes "A", and at the end comes "s" (the start of "summary"), so I want "ABCD000007348".
Simply use Regex:
string CutString(string input)
{
Match result = Regex.Match(input, @"[a-zA-Z]+[0-9]+");
return result.Value;
}
Since you didn't say if it's always a timestamp at the beginning, I've instead opted to iterate over the string to find the first alphabetical character, rather than hardcoding s.Remove(0, n); where n is however many digits are in a timestamp.
string s = "20150910000549659ABCD000007348summary.pdf";
s = s.Replace("summary.pdf", String.Empty);
int firstLetter = 0;
foreach (char c in s)
{
if (Char.IsLetter(c))
{
firstLetter = s.IndexOf(c);
break;
}
}
s = s.Remove(0, firstLetter);

C# Reading particular values in a string

I have the following line from a string:
colors numResults="100" totalResults="6806926"
I want to extract the value 6806926 from the above string
How is it possible?
So far, I have used StringReader to read the entire string line by line.
Then what should I do?
I'm sure there's also a regex, but this string approach should work also:
string xmlLine = "[<colors numResults=\"100\" totalResults=\"6806926\">]";
string pattern = "totalResults=\"";
int startIndex = xmlLine.IndexOf(pattern);
if(startIndex >= 0)
{
startIndex += pattern.Length;
int endIndex = xmlLine.IndexOf("\"", startIndex);
if(endIndex >= 0)
{
string token = xmlLine.Substring(startIndex,endIndex - startIndex);
// if you want to calculate with it
int totalResults = int.Parse( token );
}
}
Demo
Suppose the line is held in a string variable named Mytext. Then:
string key = "totalResults=\"";
string value = Mytext.Substring(Mytext.IndexOf(key) + key.Length, 7);
// IndexOf returns the position where the key starts; adding key.Length skips past it to the value,
// and 7 is the number of characters that you want to extract
I am using something similar to this.
You can read it with LINQ to XML: numResults and totalResults are attributes and <colors numResults="100" totalResults="6806926"> is an element, so you can simply get the value with myXmlElement.Attribute("totalResults").Value.
This function will split the string into a list of key/value pairs, from which you can then pull out whatever you require:
static List<KeyValuePair<string, string>> getItems(string s)
{
var retVal = new List<KeyValuePair<String, string>>();
var items = s.Split(' ');
foreach (var item in items.Where(x => x.Contains("=")))
{
retVal.Add(new KeyValuePair<string, string>( item.Split('=')[0], item.Split('=')[1].Replace("\"", "") ));
}
return retVal;
}
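If lookup by key is the main goal, a dictionary may be more convenient than a list of pairs. The following is a sketch of that variant, not the original answer's code: the names AttributeLine and ParseAttributes are hypothetical, and it assumes values contain no spaces and keys are unique (ToDictionary throws on duplicates).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AttributeLine
{
    // Hypothetical variant of getItems that returns a dictionary keyed by attribute name.
    public static Dictionary<string, string> ParseAttributes(string line)
    {
        return line.Split(' ')
            .Where(part => part.Contains("="))
            .Select(part => part.Split(new[] { '=' }, 2))      // split on the first '=' only
            .ToDictionary(kv => kv[0], kv => kv[1].Trim('"')); // strip surrounding quotes
    }
}
```

For the line from the question, ParseAttributes(line)["totalResults"] yields the string 6806926, which can be passed to int.Parse if a number is needed.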
You can use regular expressions:
string input = "colors numResults=\"100\" totalResults=\"6806926\"";
string pattern = "totalResults=\"(?<results>\\d+?)\"";
Match result = new Regex(pattern).Match(input);
Console.WriteLine(result.Groups["results"]);
Be sure to have this included:
using System.Text.RegularExpressions;

Extracting parts of a string c#

In C# what would be the best way of splitting this sort of string?
%%x%%a,b,c,d
So that I end up with the value between the %% AND another variable containing everything right of the second %%
i.e. var x = "x"; var y = "a,b,c,d"
Here a,b,c.. could be an infinitely long comma-separated list. I need to extract the list and the value between the two double-percentage signs.
(To combat the infinite part, I thought perhaps of separating the string into %%x%% and a,b,c,d. At that point I can just use something like this to get x.)
var tag = "%%";
var startTag = tag;
int startIndex = s.IndexOf(startTag) + startTag.Length;
int endIndex = s.IndexOf(tag, startIndex);
return s.Substring(startIndex, endIndex - startIndex);
Would the best approach be to use regex, or to use lots of IndexOf and Substring calls to do the extracting based on the static %% characters?
Given that what you want is "x,a,b,c,d" the Split() function is actually pretty powerful and regex would be overkill for this.
Here's an example:
string test = "%%x%%a,b,c,d";
string[] result = test.Split(new char[] { '%', ',' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string s in result) {
Console.WriteLine(s);
}
Basically, we ask it to split by both '%' and ',' and ignore empty results (e.g. the result between "%%"). Here's the result:
x
a
b
c
d
To Extract X:
If %% is always at the start then;
string s = "%%x%%a,b,c,d,h";
s = s.Substring(2,s.LastIndexOf("%%")-2);
//Console.WriteLine(s);
Else;
string s = "v,u,m,n,%%x%%a,b,c,d,h";
s = s.Substring(s.IndexOf("%%")+2,s.LastIndexOf("%%")-s.IndexOf("%%")-2);
//Console.WriteLine(s);
If you need to get them all at once then use this;
string s = "m,n,%%x%%a,b,c,d";
var myList = s.ToArray()
.Where(c=> (c != '%' && c!=','))
.Select(c=>c).ToList();
This'll let you do it all in one go:
string pattern = "^%%(.+?)%%(?:(.+?)(?:,|$))*$";
string input = "%%x%%a,b,c,d";
Match match = Regex.Match(input, pattern);
if (match.Success)
{
// "x"
string first = match.Groups[1].Value;
// { "a", "b", "c", "d" }
string[] repeated = match.Groups[2].Captures.Cast<Capture>()
.Select(c => c.Value).ToArray();
}
You can use the char.IsLetter to get all the list of letter
string test = "%%x%%a,b,c,d";
var l = test.Where(c => char.IsLetter(c)).ToArray();
var output = string.Join(", ", l.OrderBy(c => c));
Since you want the value between the %% and everything after in separate variables and you don't need to parse the CSV, I think a RegEx solution would be your best choice.
var inputString = @"%%x%%a,b,c,d";
var regExPattern = @"^%%(?<x>.+)%%(?<csv>.+)$";
var match = Regex.Match(inputString, regExPattern);
foreach (var item in match.Groups)
{
Console.WriteLine(item);
}
The pattern has 2 named groups called x and csv, so rather than just looping, you can easily reference them by name and assign them to values:
var x = match.Groups["x"].Value;
var y = match.Groups["csv"].Value;
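If the individual list items are needed as well, the named-group match can be combined with Split. The sketch below is one way to do that: TagList and TryParse are hypothetical names, and the pattern is tightened slightly (a lazy .+? for the tag, .* for the list) on the assumption that the tag itself never contains "%%".

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagList
{
    // Hypothetical helper: extracts the %%tag%% value and the comma-separated items after it.
    public static bool TryParse(string input, out string tag, out string[] items)
    {
        tag = string.Empty;
        items = Array.Empty<string>();

        // Lazy .+? stops at the first closing %%; .* allows an empty list.
        Match match = Regex.Match(input, @"^%%(?<x>.+?)%%(?<csv>.*)$");
        if (!match.Success)
            return false;

        tag = match.Groups["x"].Value;
        items = match.Groups["csv"].Value
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
        return true;
    }
}
```

The regex only separates the two parts; the list itself is left to Split, so its length does not affect the pattern.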

formatting string in MVC /C#

I have a string 731478718861993983 and I want to get 73-1478-7188-6199-3983 from it using C#. How can I format it like this?
Thanks.
By using regex:
public static string FormatTest1(string num)
{
string formatPattern = @"(\d{2})(\d{4})(\d{4})(\d{4})(\d{4})";
return Regex.Replace(num, formatPattern, "$1-$2-$3-$4-$5");
}
// test
string test = FormatTest1("731478718861993983");
// test result: 73-1478-7188-6199-3983
If you're dealing with a long number, you can use a NumberFormatInfo to format it:
First, define your NumberFormatInfo (you may want additional parameters, these are the basic 3):
NumberFormatInfo format = new NumberFormatInfo();
format.NumberGroupSeparator = "-";
format.NumberGroupSizes = new[] { 4 };
format.NumberDecimalDigits = 0;
Next, you can use it on your numbers:
long number = 731478718861993983;
string formatted = number.ToString("n", format);
Console.WriteLine(formatted);
After all, .Net has very good globalization support - you're better served using it!
string s = "731478718861993983";
var newString = string.Format("{0:##-####-####-####-####}", Convert.ToInt64(s));
LINQ-only one-liner:
var str = "731478718861993983";
var result =
new string(
str.ToCharArray().
Reverse(). // So that it will go over string right-to-left
Select((c, i) => new { @char = c, group = i / 4}). // Keep group number
Reverse(). // Restore original order
GroupBy(t => t.group). // Now do the actual grouping
Aggregate("", (s, grouping) => s + "-" + new string( // Append each group to the accumulated string
grouping.
Select(gr => gr.@char).
ToArray())).
ToArray()).
Trim('-');
This can handle strings of arbitrary lengths.
Simple (and naive) extension method:
class Program
{
static void Main(string[] args)
{
Console.WriteLine("731478718861993983".InsertChar("-", 4));
}
}
static class Ext
{
public static string InsertChar(this string str, string c, int i)
{
for (int j = str.Length - i; j > 0; j -= i) // j > 0 avoids a leading separator when the length is a multiple of i
{
str = str.Insert(j, c);
}
return str;
}
}
If you're dealing strictly with a string, you can make a simple Regex.Replace, to capture each group of 4 digits:
string str = "731478718861993983";
str = Regex.Replace(str, "(?!^).{4}", "-$0" ,RegexOptions.RightToLeft);
Console.WriteLine(str);
Note the use of RegexOptions.RightToLeft, to start capturing from the right (so "12345" will be replaced to 1-2345, and not -12345), and the use of (?!^) to avoid adding a dash in the beginning.
You may want to capture only digits - a possible pattern then may be @"\B\d{4}".
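string myString = "731478718861993983";
// Strings are immutable: Insert returns a new string, so the result must be assigned back.
myString = myString.Insert(2,"-");
myString = myString.Insert(7,"-");
myString = myString.Insert(12,"-");
myString = myString.Insert(17,"-");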
My first thought is:
String s = "731478718861993983";
s = s.Insert(2,"-");
s = s.Insert(7,"-");
s = s.Insert(12,"-");
s = s.Insert(17,"-");
(String.Insert takes a zero-based index, hence 2, 7, 12 and 17.)
but there is probably some easier way to do this...
If the position of "-" is always the same then you can try
string s = "731478718861993983";
s = s.Insert(2, "-");
s = s.Insert(7, "-");
s = s.Insert(12, "-");
s = s.Insert(17, "-");
Here's how I'd do it. A format string like this only works on numeric types, not on strings, so the string is parsed to a long first.
string numbers = "731478718861993983";
string formattedNumbers = String.Format("{0:##-####-####-####-####}", long.Parse(numbers));
Edit: amended code, since you said they were held as a string in your original question.
