I want to parse my script file with a regex in C#

I have a script file like the one below:
[grade]
`[achievement]`
[gold multiple]
250
[level]
34
99
[pre required quest]
38
[/pre required quest]
For example:
lex("grade") should return "`[achievement]`"
lex("level") should return "34,99"
Maybe I can do it with LINQ, but I haven't found a way.
I tried:
scripts = File.ReadAllText(scriptFilePath);
string gradeKeyword = @"(?<=\[grade\]\r\n).*?\r\n*(?=\[.*\]\r\n)";
Regex reg = new Regex(gradeKeyword);
Match mat = reg.Match(scripts);
It didn't work (I want to get [achievement]).
BTW, can I do that with LINQ?

You could try not using a regex.
public static IEnumerable<string> GetScriptSection(string file, string section)
{
    var startMatch = string.Format("[{0}]", section);
    var endMatch = string.Format("[/{0}]", section);
    var lines = file.Split(Environment.NewLine.ToCharArray(), StringSplitOptions.RemoveEmptyEntries).Select(s => s.Trim()).ToList();
    int headerIndex = lines.FindIndex(f => f == startMatch);
    if (headerIndex == -1)
    {
        return new List<string>(); // section not found
    }
    int startIndex = headerIndex + 1;
    int endIndex = lines.FindLastIndex(f => f == endMatch);
    if (endIndex == -1)
    {
        // No closing tag: stop at the next "[...]" line. The first line after the header
        // always counts as content, so a value such as "[achievement]" is kept.
        endIndex = startIndex < lines.Count ? lines.FindIndex(startIndex + 1, f => f.StartsWith("[")) : -1;
        endIndex = endIndex == -1 ? lines.Count : endIndex;
    }
    return lines.GetRange(startIndex, endIndex - startIndex).Where(l => !string.IsNullOrWhiteSpace(l)).ToList();
}
But I would just use YAML, XML, or some other widely used format instead of rolling my own.
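Since the question also asks whether LINQ can do this, here is a minimal sketch of a LINQ-only version. The class name ScriptLexer and the method name Lex are hypothetical (chosen to mirror the lex(...) calls in the question), and it assumes a section ends at the next line that starts with "[", except that the first line after the header always counts as a value.

```csharp
using System;
using System.Linq;

public static class ScriptLexer
{
    // Hypothetical helper: returns the values listed under [section], joined with commas.
    public static string Lex(string script, string section)
    {
        var header = "[" + section + "]";
        var values = script
            .Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Trim())
            .SkipWhile(line => line != header) // drop everything before the header
            .Skip(1)                           // drop the header line itself
            // Keep the first line unconditionally, then stop at the next "[...]" line.
            .TakeWhile((line, i) => i == 0 || !line.StartsWith("["));
        return string.Join(",", values);
    }
}
```

With the script from the question, ScriptLexer.Lex(script, "level") yields "34,99", and an unknown section name yields an empty string.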

By default, the . in a .NET regular expression does not match a newline (\n), so a pattern using .* cannot span lines. To change that, specify the RegexOptions.Singleline option.
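A small sketch of the difference, assuming the script uses \r\n line endings as in the question. The class and method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    private const string Text = "[level]\r\n34\r\n99\r\n";

    // Without Singleline, ".*" stops at the first \n, so it never reaches "99".
    public static bool MatchesByDefault()
    {
        return Regex.IsMatch(Text, @"\[level\].*99");
    }

    // With Singleline, "." also matches \n, so the pattern can span lines.
    public static bool MatchesWithSingleline()
    {
        return Regex.IsMatch(Text, @"\[level\].*99", RegexOptions.Singleline);
    }
}
```

Here MatchesByDefault() returns false and MatchesWithSingleline() returns true.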

Try the following expression:
string gradeKeyword = "\\[level\\]\\r\\n([^\\[].+\\r\\n)+";

Related

How to perform word search using LINQ?

I have a list which contains the names of suppliers, say:
SuppId Supplier Name
----------------------------------
1 Aardema & Whitelaw
2 Aafedt Forde Gray
3 Whitelaw & Sears-Ewald
Using the following LINQ query
supplierListQuery = supplierListQuery.Where(x => x.SupplierName.Contains(SearchKey));
I can return records correctly in the following cases:
1) If I use the search string "Whitelaw & Sears-Ewald", it returns the 3rd record.
2) If I use "Sears-Ewald", it returns the 3rd record ("Whitelaw" alone also matches the 1st record).
But how can I return the 3rd record if the search string is "Whitelaw Sears-Ewald"? It always returns 0 records.
Can I use All to get this result? I don't know how to use it for this particular need.
What I usually do in this situation is split the words into a collection, then perform the following:
var searchopts = SearchKey.Split(' ').ToList();
supplierListQuery = supplierListQuery
.Where(x => searchopts.Any(y=> x.SupplierName.Contains(y)));
This works for me:
IEnumerable<string> keyWords = SearchKey.Split(' ');
supplierListQuery = supplierListQuery
    .AsParallel()
    .Where
    (
        x => keyWords.All
        (
            // IndexOf with OrdinalIgnoreCase acts as a case-insensitive Contains
            keyword => x.SupplierName.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0
        )
    );
Thank you all for your quick responses. The easy fix that worked for me was timothyclifford's suggestion. As he said, I altered my code to this:
string[] filters = SearchKey.ToLower().Split(new[] { ' ' });
objSuppliersList = (from x in objSuppliersList
                    where filters.All(f => x.SupplierName.ToLower().Contains(f))
                    select x).ToList();
Now it returns the correct result for all my search conditions.
Because "Whitelaw" appears in both you will get both records. Otherwise there is no dynamic way to determine you only want the last one. If you know you only have these 3 then append .Last() to get the final record.
supplierListQuery = supplierListQuery.Where(x => x.SupplierName.Contains(SearchKey.Split(' ')[0]));
You need some sort of string comparer to build your own simple search engine; then you can find the strings that are most likely to belong in your result:
public static class SearchEngine
{
public static double CompareStrings(string val1, string val2)
{
if ((val1.Length == 0) || (val2.Length == 0)) return 0;
if (val1 == val2) return 100;
double maxLength = Math.Max(val1.Length, val2.Length);
double minLength = Math.Min(val1.Length, val2.Length);
int charIndex = 0;
for (int i = 0; i < minLength; i++) { if (val1.Contains(val2[i])) charIndex++; }
return Math.Round(charIndex / maxLength * 100);
}
public static List<string> Search(this string[] values, string searchKey, double threshold)
{
List<string> result = new List<string>();
for (int i = 0; i < values.Length; i++) if (CompareStrings(values[i], searchKey) > threshold) result.Add(values[i]);
return result;
}
}
Example of usage :
string[] array = { "Aardema & Whitelaw", "Aafedt Forde Gray", "Whitelaw & Sears-Ewald" };
var result = array.Search("WhitelawSears-Ewald", 80);
// Results that matches this string with 80% or more
foreach (var item in result)
{
Console.WriteLine(item);
}
Output: Whitelaw & Sears-Ewald
If you want an easy (not very handy) solution:
var result = supplierListQuery
    .Select(x => normalize(x.SupplierName))
    .Where(x => x.Contains(normalize(SearchKey)));
string normalize(string inputStr)
{
    string retVal = inputStr.Replace("&", "");
    // Collapse runs of spaces left behind by the removed "&" into a single space.
    while (retVal.IndexOf("  ") >= 0)
    {
        retVal = retVal.Replace("  ", " ");
    }
    return retVal;
}

Regex to split by a Targeted String up to a certain character

I have an LDAP query from which I need to build the domain.
So I want to split by "DC=" up to a comma.
INPUT:
LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account
RESULT:
SOMETHINGS.ELSE.NET
You can do it pretty simply using the DC=(\w*) regex pattern.
var str = @"LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account";
var result = String.Join(".", Regex.Matches(str, @"DC=(\w*)")
.Cast<Match>()
.Select(m => m.Groups[1].Value));
Without Regex you can do:
string ldapStr = @"LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account";
int startIndex = ldapStr.IndexOf("DC=");
string output = null;
if (startIndex >= 0)
{
    // The domain components run from the first "DC=" up to the backslash (or the end of the string).
    int endIndex = ldapStr.IndexOf('\\', startIndex);
    if (endIndex < 0) endIndex = ldapStr.Length;
    string domainComponentStr = ldapStr.Substring(startIndex, endIndex - startIndex);
    output = String.Join(".", domainComponentStr.Split(new[] {"DC=", ","}, StringSplitOptions.RemoveEmptyEntries));
}
If you are always going to get the string in a similar format, then you can also do:
string ldapStr = @"LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account";
var outputStr = String.Join(".", ldapStr.Split(new[] {"DC=", ",","\\"}, StringSplitOptions.RemoveEmptyEntries)
.Skip(1)
.Take(3));
And you will get:
outputStr = "SOMETHINGS.ELSE.NET"

How to get a specific part of a string in C#

I have a string
string a = "(something is there),xyz,(something there)";
and, I use this
string s = "(something is there),xyz,(something there)";
int start = s.IndexOf("(") + 1;
int end = s.IndexOf(")", start);
string result = s.Substring(start, end - start);
but I want to get the second part, (something there).
How can I do it?
a.Split("(),".ToCharArray(),StringSplitOptions.RemoveEmptyEntries);
This will return an array with 3 strings: something is there, xyz, and something there
Not sure what exactly you're doing around this, however this does it in this specific case:
var last = s.Split(',').Last(); // "(something there)"
Or more verbosely for explanation:
var s = "(something is there),xyz,(something there)";
var split = s.Split(','); // [ "(something is there)", "xyz", "(something there)" ]
var last = split.Last(); // "(something there)"
And if you don't want the brackets(en-GB)
var content = last.Trim('(', ')'); // "something there"
If "last" is the same as "second" in this case you can use String.LastIndexOf:
string lastPart = null;
int lastStartIndex = a.LastIndexOf('(');
if (lastStartIndex >= 0)
{
int lastEndIndex = a.LastIndexOf(')');
if (lastEndIndex >= 0)
lastPart = a.Substring(++lastStartIndex, lastEndIndex - lastStartIndex);
}
Here is a solution which extracts all tokens from the string into a List<string>:
int startIndex = -1, endIndex = -1;
var tokens = new List<string>();
while (true)
{
startIndex = a.IndexOf('(', ++endIndex);
if (startIndex == -1) break;
endIndex = a.IndexOf(')', ++startIndex);
if (endIndex == -1) break;
tokens.Add(a.Substring(startIndex, endIndex - startIndex));
}
So now you could use the indexer or Enumerable.ElementAtOrDefault:
string first = tokens[0];
string second = tokens.ElementAtOrDefault(1);
If the list is too small you get null as result. If you just want the last use tokens.Last().
You can use this:
string s = "(something is there),xyz,(something there)";
var start = s.Split(',')[2];
You can also use:
string s = "(something is there),xyz,(something there)";
Regex regex = new Regex(@"\([^()]*\)(?=[^()]*$)");
Match match = regex.Match("(something is there),xyz,(something there)");
var result = match.Value;
You could use the following if you just want the text:
var s = "(something is there),xyz,(something there)";
var splits = s.Split('(');
var text = splits[2].Trim(')');
If you want to get the text between second '(' and ')' then use the second parameter of IndexOf which sets the starting index for searching
start = s.IndexOf("(", end) + 1;
end = s.IndexOf(")", start);
string secondResult = s.Substring(start, end - start);
If you want to get the string after the last ) use this code:
string otherPart = s.Substring(end+1);

Split a string containing digits

I have a string like
"abc kskd 8.900 prew"
I need to split this string so that I get "abc kskd" and "8.900 prew" as the result.
How can I achieve this with C#?
Get the index of first digit using LINQ then use Substring:
var input = "abc kskd 8.900 prew";
var index = input.Select( (x,idx) => new {x, idx})
.Where(c => char.IsDigit(c.x))
.Select(c => c.idx)
.First();
var part1 = input.Substring(0, index);
var part2 = input.Substring(index);
This should do if you don't need to do something complicated:
var data = "abc kskd 8.900 prew";
var digits = "0123456789".ToCharArray();
var idx = data.IndexOfAny(digits);
if (idx != -1)
{
var firstPart = data.Substring(0, idx).TrimEnd(); // TrimEnd drops the separating space and is safe when idx is 0
var secondPart = data.Substring(idx);
}
IndexOfAny is actually very fast.
This could also be modified to separate the string into more parts (using the startIndex parameter), but you didn't ask for that.
Straightforward with a regular expression:
var str = "abc kskd 8.900 prew";
var result = Regex.Split(str, @"\W(\d.*)").Where(x => x!="").ToArray();
Try this,
public string[] SplitText(string text)
{
    var digits = "0123456789".ToCharArray();
    var startIndex = 0;
    while (startIndex < text.Length)
    {
        var index = text.IndexOfAny(digits, startIndex);
        if (index < 0)
        {
            break;
        }
        // Search backwards from the digit for the nearest preceding space.
        var spaceIndex = text.LastIndexOf(' ', index);
        if (spaceIndex >= 0)
        {
            return new String[] { text.Substring(0, spaceIndex), text.Substring(spaceIndex + 1) };
        }
        // No space before this digit: move past it so the loop always advances.
        startIndex = index + 1;
    }
    return new String[] {text};
}
Something similar to what @Dominic Kexel provided, but only if you don't want to use LINQ.
string[] result = Regex.Split("abc kskd 8.900 prew", @"\w*(?=\d+\.\d)");

How to remove characters from a string using LINQ

I have a string like
XQ74MNT8244A
I need to remove all the letters from the string,
so the output will be
748244
How can I do this?
Please help me with this.
new string("XQ74MNT8244A".Where(char.IsDigit).ToArray()) == "748244"
Two options. Using Linq on .Net 4 (on 3.5 it is similar - it doesn't have that many overloads of all methods):
string s1 = String.Concat(str.Where(Char.IsDigit));
Or, using a regular expression:
string s2 = Regex.Replace(str, @"\D+", "");
I should add that IsDigit and \D are Unicode-aware, so they accept quite a few digits besides 0-9, for example "542abc٣٤".
You can easily adapt them to a check between 0 and 9, or to [^0-9]+.
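To make the Unicode point concrete, here is a small sketch comparing the Unicode-aware filter with the two ASCII-only variants mentioned above. The class and method names are invented for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DigitFilterDemo
{
    // Keeps every Unicode decimal digit, including Arabic-Indic digits such as '٣'.
    public static string UnicodeDigits(string s)
    {
        return string.Concat(s.Where(char.IsDigit));
    }

    // Keeps only the ASCII digits 0-9 by using an explicit range check.
    public static string AsciiDigits(string s)
    {
        return string.Concat(s.Where(c => c >= '0' && c <= '9'));
    }

    // Same ASCII-only result, expressed with the [^0-9]+ character class.
    public static string AsciiDigitsRegex(string s)
    {
        return Regex.Replace(s, "[^0-9]+", "");
    }
}
```

For the input "542abc٣٤", UnicodeDigits returns "542٣٤", while both ASCII variants return "542".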
string value = "HTQ7899HBVxzzxx";
Console.WriteLine(new string(
value.Where(x => (x >= '0' && x <= '9'))
.ToArray()));
If you need only digits and you really want Linq try this:
new string(yourString.Where(x => char.IsDigit(x)).ToArray());
Using LINQ:
public string FilterString(string input)
{
return new string(input.Where(char.IsNumber).ToArray());
}
Something like this?
"XQ74MNT8244A".ToCharArray().Where(x => { var i = 0; return Int32.TryParse(x.ToString(), out i); })
string s = "XQ74MNT8244A";
var x = new string(s.Where(c => (c >= '0' && c <= '9')).ToArray());
How about an extension method (and overload) that does this for you:
public static string NumbersOnly(this string Instring)
{
return Instring.NumbersOnly("");
}
public static string NumbersOnly(this string Instring, string AlsoAllowed)
{
char[] aChar = Instring.ToCharArray();
int intCount = 0;
string strTemp = "";
for (intCount = 0; intCount <= Instring.Length - 1; intCount++)
{
if (char.IsNumber(aChar[intCount]) || AlsoAllowed.IndexOf(aChar[intCount]) > -1)
{
strTemp = strTemp + aChar[intCount];
}
}
return strTemp;
}
The overload is so you can retain "-", "$" or "." as well, if you wish (instead of strictly numbers).
Usage:
string numsOnly = "XQ74MNT8244A".NumbersOnly();
