Regular Expressions with C#

How can I use Regular Expressions to split this string
String s = "[TEST name1=\"smith ben\" name2=\"Test\" abcd=\"Test=\" mmmm=\"Test=\"]";
into a list like below:
name1 smith ben
name2 Test
abcd Test=
mmmm Test=
It is similar to getting attributes from an element but not quite.

First remove the brackets and the 'TEST' part from the string so that only the keys and values remain. Then split on the double-quote character ('\"') into an array: counting from zero, the even-indexed entries hold the keys (each with a trailing '=') and the odd-indexed entries hold the values. After that, it's easy enough to populate your list:
String s = "[TEST name1=\"smith ben\" name2=\"Test\" abcd=\"Test=\" mmmm=\"Test=\"]";
// Note: SortedList orders entries by key and rejects duplicate keys
SortedList<string, string> list = new SortedList<string, string>();
// Remove the start and end tags
s = s.Remove(0, s.IndexOf(' '));
s = s.Remove(s.LastIndexOf('\"') + 1);
// Split the string on the quote character
string[] pairs = s.Split(new char[] { '\"' }, StringSplitOptions.None);
// Add each pair to the list
for (int i = 0; i + 1 < pairs.Length; i += 2)
{
    // Trim both ends: each key has a leading space and a trailing '='
    string left = pairs[i].Trim('=', ' ');
    string right = pairs[i + 1];
    list.Add(left, right);
}
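Since the question asks for regular expressions, here is a minimal sketch of that route. It assumes every value is wrapped in double quotes and contains no embedded quotes; the class and method names (AttributeParser, Parse) are hypothetical and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AttributeParser
{
    // Hypothetical helper: returns each key="value" pair in order of appearance.
    public static List<KeyValuePair<string, string>> Parse(string s)
    {
        var result = new List<KeyValuePair<string, string>>();
        // (\w+)    : the key, one or more word characters
        // ="       : a literal equals sign and the opening quote
        // ([^"]*)  : the value, everything up to the next quote
        foreach (Match m in Regex.Matches(s, @"(\w+)=""([^""]*)"""))
        {
            result.Add(new KeyValuePair<string, string>(
                m.Groups[1].Value, m.Groups[2].Value));
        }
        return result;
    }
}
```

Unlike the SortedList version, this keeps the pairs in their original order and tolerates repeated keys.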

Related

Fetch Occurrence of alphabet in a string c#

I have a string that looks like this:
E-1,E-2,F-3,F-1,G-1,E-2,F-5
Now I want the output in an array like this:
E, F, G
I want each letter that appears in the string to be listed only once.
My code sample is as follows:
string str1 = "E-1,E-2,F-3,F-1,G-1,E-2,F-5";
string[] newtmpSTR = str1.Split(new char[] { ',' });
Dictionary<string, string> tmpDict = new Dictionary<string, string>();
foreach(string str in newtmpSTR){
string[] tmpCharPart = str.Split('-');
if(!tmpDict.ContainsKey(tmpCharPart[0])){
tmpDict.Add(tmpCharPart[0], "");
}
}
Is there an easy way to do this in C# using string functions? If yes, how?
string input = "E-1,E-2,F-3,F-1,G-1,E-2,F-5";
string[] splitted = input.Split(new char[] { ',' });
var letters = splitted.Select(s => s.Substring(0, 1)).Distinct().ToList();
Maybe you can obtain the same result with a regular expression! :-)
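One way the regular-expression variant might look is sketched below. It assumes each item has the form prefix-number and that items are separated by commas; the names PrefixExtractor and DistinctPrefixes are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PrefixExtractor
{
    // Hypothetical helper: returns each distinct prefix before the '-' once.
    public static string[] DistinctPrefixes(string input)
    {
        // (?:^|,)   : an item starts at the beginning of the string or after a comma
        // ([^,-]+)  : the prefix, anything up to the dash
        // -         : the dash that separates prefix and number
        return Regex.Matches(input, @"(?:^|,)([^,-]+)-")
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .Distinct()
            .ToArray();
    }
}
```

Unlike Substring(0, 1), this also handles prefixes longer than one character.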

Get first word on every new line in a long string?

I am trying to add a leaderboard to my Unity app.
I have a long string like the one below (just an example; the actual string is pipe-delimited data returned over HTTP from my web service, not stored manually):
string str ="name1|10|junk data.....\n
name2|9|junk data.....\n
name3|8|junk data.....\n
name4|7|junk data....."
I want to get the first word (the string before the first pipe '|', such as name1, name2, ...) from every line and store it in an array, and then get the numbers (10, 9, 8, ... after the '|') and store them in another one.
Does anyone know the best way to do this?
Fiddle here: https://dotnetfiddle.net/utp4HK
Code below. You may want to revisit the algorithm for performance, but if that is not an issue, this will do the trick:
using System;
public class Program
{
    public static void Main()
    {
        string str = "name1|10|junk data.....\nname2|9|junk data.....\nname3|8|junk data.....\nname4|7|junk data.....";
        // Split into lines, then print the field before the first '|'
        foreach (var line in str.Split('\n'))
        {
            Console.WriteLine(line.Split('|')[0]);
        }
    }
}
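First split on newline characters. The sample data uses "\n", so splitting on Environment.NewLine alone would miss it on Windows, where that value is "\r\n"; splitting on both covers either case:
string[] lines = str.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
Then you can use LINQ to get both arrays:
// ToList() materializes the query so it is not re-evaluated for each array
var data = lines.Select(l => l.Trim().Split('|')).Where(arr => arr.Length > 1).ToList();
string[] names = data.Select(arr => arr[0].Trim()).ToArray();
string[] numbers = data.Select(arr => arr[1].Trim()).ToArray();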
Check out this link on splitting strings: http://msdn.microsoft.com/en-us/library/ms228388.aspx
You could first create an array of strings (one for each line) by splitting the long string with \n as the delimiter.
Then, you could split each line with | as the delimiter. The name would be at index 0 of the resulting array and the number would be at index 1.
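The two steps just described could be sketched as follows, without LINQ. The class and method names (LeaderboardParser, Parse) are hypothetical, and the sketch assumes every useful line has at least two pipe-separated fields.

```csharp
using System;
using System.Collections.Generic;

public static class LeaderboardParser
{
    // Hypothetical helper: fills one array with names and another with numbers.
    public static void Parse(string str, out string[] names, out string[] numbers)
    {
        var nameList = new List<string>();
        var numberList = new List<string>();
        // Step 1: one entry per line
        foreach (string line in str.Split('\n'))
        {
            // Step 2: split the line on '|'; Trim() also drops a trailing '\r'
            string[] fields = line.Trim().Split('|');
            if (fields.Length < 2)
            {
                continue; // skip blank or malformed lines
            }
            nameList.Add(fields[0]);
            numberList.Add(fields[1]);
        }
        names = nameList.ToArray();
        numbers = numberList.ToArray();
    }
}
```

The numbers stay as strings here; int.Parse or int.TryParse can convert them if the leaderboard needs numeric sorting.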
First of all, you can't have a multi-line string without using a verbatim string literal. With a verbatim string literal, you can split your string on line breaks; splitting on both "\r\n" and "\n" works regardless of which line endings the source file uses:
string str = @"name1|10|junk data.....
name2|9|junk data.....
name3|8|junk data.....
name4|7|junk data.....";
var array = str.Split(new[] { "\r\n", "\n" },
                      StringSplitOptions.RemoveEmptyEntries);
foreach (var item in array)
{
    Console.WriteLine(item.Split(new[] { "|" },
                      StringSplitOptions.RemoveEmptyEntries)[0].Trim());
}
Output will be:
name1
name2
name3
name4
Try this:
string str = "name1|10|junk data.....\n" +
             "name2|9|junk data.....\n" +
             "name3|8|junk data.....\n" +
             "name4|7|junk data.....";
string[] tempArray1 = str.Split('\n');
string[] tempArray2 = null;
string[,] newArray = null;
for (int i = 0; i < tempArray1.Length; i++)
{
    tempArray2 = tempArray1[i].Split('|');
    // Allocate the 2D array once, sized from the first line's field count
    if (newArray == null)
    {
        newArray = new string[tempArray1.Length, tempArray2.Length];
    }
    // Guard against lines that have more fields than the first line
    for (int j = 0; j < tempArray2.Length && j < newArray.GetLength(1); j++)
    {
        newArray[i, j] = tempArray2[j];
    }
}

Parse for words starting with # character in a string

I have to write a program that parses a string for words starting with '#' and returns those words along with the '#' symbol.
I have tried something like:
char[] delim = { '#' };
string[] strArr = commenttext.Split(delim);
return strArr;
But it returns an array of all the words, without the '#'.
I need something pretty straightforward. No LINQ-like things.
If the string is "abc #ert #xyz", then I should get back #ert and #xyz.
If you define "word" as "separated by spaces" then this would work:
string[] strArr = commenttext.Split(' ')
.Where(w => w.StartsWith("#"))
.ToArray();
If you need something more complex, a Regular Expression might be more appropriate.
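As a rough illustration of the regular-expression route, the sketch below treats a "word" as a run of non-whitespace characters that begins with '#'. That definition is an assumption, and the names HashWordFinder and Find are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HashWordFinder
{
    // Hypothetical helper: returns every whitespace-delimited word starting with '#'.
    public static string[] Find(string text)
    {
        var found = new List<string>();
        // (?<!\S) : not preceded by a non-whitespace character (start of a word)
        // #\S+    : '#' followed by one or more non-whitespace characters
        foreach (Match m in Regex.Matches(text, @"(?<!\S)#\S+"))
        {
            found.Add(m.Value);
        }
        return found.ToArray();
    }
}
```

Because the pattern uses \S rather than a literal space, tabs and line breaks also count as separators, and a lone '#' with nothing after it is ignored.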
"I need something pretty straightforward. No LINQ-like things."
The non-Linq equivalent would be:
var words = commenttext.Split(' ');
List<string> temp = new List<string>();
foreach(string w in words)
{
if(w.StartsWith("#"))
temp.Add(w);
}
string[] strArr = temp.ToArray();
If you're against using LINQ, which you should not be unless you're required to target older .NET versions, an approach along these lines would suit your needs:
string[] words = commenttext.Split(' ');
List<string> result = new List<string>();
for (int i = 0; i < words.Length; i++)
{
    string word = words[i];
    if (word.StartsWith("#"))
    {
        result.Add(word); // keep the word, including its leading '#'
    }
}
const string test = "#Amir abcdef #Stack #C# mnop xyz";
var splited = test.Split(' ').Where(m => m.StartsWith("#")).ToList();
foreach (var b in splited)
{
    // Print the whole word; b.Substring(1) would drop the leading '#'
    Console.WriteLine(b);
}
Console.ReadKey();

Extract node value from xml resembling string C#

I have strings like the one below:
<ad nameId="\862094\"></ad>
or comma-separated, like this:
<ad nameId="\862593\"></ad>,<ad nameId="\862094\"></ad>,<ad nameId="\865599\"></ad>
How can I extract the nameId value and store it in a single string, like below:
string extractedValues ="862094";
or, in the case of the comma-separated string above:
string extractedMultipleValues ="862593,862094,865599";
This is what I have started trying, but I am not sure about it:
string myString = @"<ad nameId=""\862593\""></ad>,<ad nameId=""\862094\""></ad>,<ad nameId=""\865599\""></ad>";
string[] myStringArray = myString.Split(',');
foreach (string str in myStringArray)
{
    xd.LoadXml(str);
    chkStringVal = xd.SelectSingleNode("/ad/@nameId").Value;
}
Search for:
<ad nameId="\\(\d*)\\"><\/ad>
Replace with:
$1
Note that you must search globally. Example: http://www.regex101.com/r/pL2lX1
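In C#, that search-and-replace could be sketched with Regex.Replace, which replaces every match by default, so no separate "global" flag is needed. The wrapper names NameIdExtractor and Extract are hypothetical; the pattern is the one given above, written as a verbatim string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameIdExtractor
{
    // Hypothetical helper: replaces each <ad nameId="\123\"></ad> element
    // with the digits it contains, leaving the commas between elements intact.
    public static string Extract(string input)
    {
        // In a verbatim string, "" is a literal quote and \\ is a regex-escaped backslash
        return Regex.Replace(input, @"<ad nameId=""\\(\d*)\\""></ad>", "$1");
    }
}
```

For the comma-separated input from the question this yields the digits joined by the original commas; text that does not match the pattern is passed through unchanged.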
Please see the code below to extract all the numbers in your example:
string value = @"<ad nameId=""\862093\""></ad>,<ad nameId=""\862094\""></ad>,<ad nameId=""\865599\""></ad>";
// Capture only the digits between the two backslashes, in left-to-right order
var matches = Regex.Matches(value, @"\\(\d+)\\");
foreach (Match item in matches)
{
    string yourMatchNumber = item.Groups[1].Value;
}
Try it like this:
string s = #"<ad nameId=""\862094\""></ad>";
if (!(s.Contains(",")))
{
string extractedValues = s.Substring(s.IndexOf("\\") + 1, s.LastIndexOf("\\") - s.IndexOf("\\") - 1);
}
else
{
string[] array = s.Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
string extractedMultipleValues = "";
for (int i = 0; i < array.Length; i++)
{
extractedMultipleValues += array[i].Substring(array[i].IndexOf("\\") + 1, array[i].LastIndexOf("\\") - array[i].IndexOf("\\") - 1) + ",";
}
Console.WriteLine(extractedMultipleValues.Substring(0, extractedMultipleValues.Length -1));
}
mhasan, here is an example of what you need (well, almost).
EDITED: complete code (it's a little tricky)
(Sorry for the image, but I had some trouble with tags in the editor; I can send the code by email if you want.)
A little explanation of the code: it replaces all occurrences of parsePattern in the given string, so if the given string has multiple tags separated by ",", the final result will be the numbers separated by ",", stored in the parse variable.
Hope it helps.

Extracting parts of a string c#

In C# what would be the best way of splitting this sort of string?
%%x%%a,b,c,d
I want to end up with the value between the %% markers and another variable containing everything to the right of the second %%,
i.e. var x = "x"; var y = "a,b,c,d"
Here a,b,c,... could be an arbitrarily long comma-separated list. I need to extract the list and the value between the two double-percent signs.
To cope with the unbounded list, I thought of separating the string into %%x%% and a,b,c,d. At that point I can use something like this to get x:
var tag = "%%";
var startTag = tag;
int startIndex = s.IndexOf(startTag) + startTag.Length;
int endIndex = s.IndexOf(tag, startIndex);
return s.Substring(startIndex, endIndex - startIndex);
Would the best approach be to use a regex, or to use lots of IndexOf and Substring calls to do the extracting based on the static %% characters?
Given that what you want is "x,a,b,c,d" the Split() function is actually pretty powerful and regex would be overkill for this.
Here's an example:
string test = "%%x%%a,b,c,d";
string[] result = test.Split(new char[] { '%', ',' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string s in result) {
Console.WriteLine(s);
}
Basically, we ask it to split on both '%' and ',' and to ignore empty results (e.g. the empty string between the two characters of "%%"). Here's the result:
x
a
b
c
d
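If the tag and the list need to stay in separate variables, as the question describes, Split with the string separator "%%" is one option. The sketch below assumes the input always starts with %%tag%%; the names TagSplitter and Split are made up for illustration.

```csharp
using System;

public static class TagSplitter
{
    // Hypothetical helper: separates "%%x%%a,b,c,d" into "x" and { "a", "b", "c", "d" }.
    public static void Split(string s, out string tag, out string[] items)
    {
        // Splitting on "%%" gives { "", "x", "a,b,c,d" }; the empty entry is dropped
        string[] parts = s.Split(new[] { "%%" }, StringSplitOptions.RemoveEmptyEntries);
        tag = parts.Length > 0 ? parts[0] : "";
        // The remainder is an ordinary comma-separated list of any length
        items = parts.Length > 1 ? parts[1].Split(',') : new string[0];
    }
}
```

If text can appear before the first %%, this sketch would misread it as the tag, so an IndexOf-based or regex-based approach fits that case better.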
To extract x:
If %% is always at the start:
string s = "%%x%%a,b,c,d,h";
s = s.Substring(2,s.LastIndexOf("%%")-2);
//Console.WriteLine(s);
Otherwise:
string s = "v,u,m,n,%%x%%a,b,c,d,h";
s = s.Substring(s.IndexOf("%%")+2,s.LastIndexOf("%%")-s.IndexOf("%%")-2);
//Console.WriteLine(s);
If you need to get them all at once, use this (it yields a List<char>, so it only works when every value is a single character):
string s = "m,n,%%x%%a,b,c,d";
var myList = s.Where(c => c != '%' && c != ',').ToList();
This'll let you do it all in one go:
string pattern = "^%%(.+?)%%(?:(.+?)(?:,|$))*$";
string input = "%%x%%a,b,c,d";
Match match = Regex.Match(input, pattern);
if (match.Success)
{
// "x"
string first = match.Groups[1].Value;
// { "a", "b", "c", "d" }
string[] repeated = match.Groups[2].Captures.Cast<Capture>()
.Select(c => c.Value).ToArray();
}
You can use char.IsLetter to get a list of all the letters:
string test = "%%x%%a,b,c,d";
var l = test.Where(c => char.IsLetter(c)).ToArray();
var output = string.Join(", ", l.OrderBy(c => c));
Since you want the value between the %% and everything after it in separate variables, and you don't need to parse the CSV, I think a RegEx solution would be your best choice.
var inputString = @"%%x%%a,b,c,d";
var regExPattern = @"^%%(?<x>.+)%%(?<csv>.+)$";
var match = Regex.Match(inputString, regExPattern);
foreach (var item in match.Groups)
{
    Console.WriteLine(item);
}
The pattern has two named groups called x and csv, so rather than just looping, you can reference them by name and assign their values to variables:
var x = match.Groups["x"].Value;
var y = match.Groups["csv"].Value;
