How to perform word search using LINQ? - c#

I have a list which contains the name of suppliers. Say
SuppId Supplier Name
----------------------------------
1 Aardema & Whitelaw
2 Aafedt Forde Gray
3 Whitelaw & Sears-Ewald
Using the following LINQ query:
supplierListQuery = supplierListQuery.Where(x => x.SupplierName.Contains(SearchKey));
I get the correct records in the following cases:
1) If I search for "Whitelaw & Sears-Ewald", it returns the 3rd record.
2) If I search for "Whitelaw" or "Sears-Ewald", it returns the 3rd record.
But how can I return the 3rd record when the search string is "Whitelaw Sears-Ewald"? It always returns 0 records.
Can I use All to get this result? I don't know how to use it for this particular need.

What I usually do in this situation is split the words into a collection, then perform the following:
var searchopts = SearchKey.Split(' ').ToList();
supplierListQuery = supplierListQuery
.Where(x => searchopts.Any(y=> x.SupplierName.Contains(y)));

This works for me (ContainsIgnoreCase is a custom extension method, not part of the framework):
IEnumerable<string> keyWords = SearchKey.Split(' ');
supplierListQuery = supplierListQuery
.AsParallel()
.Where
(
x => keyWords.All
(
keyword => x.SupplierName.ContainsIgnoreCase(keyword)
)
);

Thank you all for your quick responses. The easiest fix that worked for me was timothyclifford's note. As he suggested, I altered my code to this:
string[] filters = SearchKey.ToLower().Split(new[] { ' ' });
objSuppliersList = (from x in objSuppliersList
where filters.All(f => x.SupplierName.ToLower().Contains(f))
select x).ToList();
Now it returns the expected result for all my search conditions.
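To make the difference between Any and All concrete, here is a minimal, self-contained sketch run against the three sample supplier names. The helper name MatchesAllWords and the in-memory list are assumptions made for illustration; they are not part of the original code, and the sketch uses C# 9 top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: true when every space-separated word of searchKey
// occurs somewhere in name, ignoring case.
static bool MatchesAllWords(string name, string searchKey)
{
    string[] words = searchKey.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    return words.All(w => name.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0);
}

// Sample data taken from the question.
var suppliers = new List<string>
{
    "Aardema & Whitelaw",
    "Aafedt Forde Gray",
    "Whitelaw & Sears-Ewald"
};

List<string> hits = suppliers.Where(s => MatchesAllWords(s, "whitelaw sears-ewald")).ToList();
Console.WriteLine(string.Join(", ", hits)); // Whitelaw & Sears-Ewald
```

With Any instead of All, "Aardema & Whitelaw" would match as well, because it contains one of the two words. Note that an empty search key matches every record here, since All is true for an empty word list.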

Because "Whitelaw" appears in both you will get both records. Otherwise there is no dynamic way to determine you only want the last one. If you know you only have these 3 then append .Last() to get the final record.
supplierListQuery = supplierListQuery.Where(x => x.SupplierName.Contains(SearchKey.Split(' ')[0]));

You can use some sort of string comparer to build your own simple search engine, and then find the strings that are most likely to belong in your result:
public static class SearchEngine
{
public static double CompareStrings(string val1, string val2)
{
if ((val1.Length == 0) || (val2.Length == 0)) return 0;
if (val1 == val2) return 100;
double maxLength = Math.Max(val1.Length, val2.Length);
double minLength = Math.Min(val1.Length, val2.Length);
int charIndex = 0;
for (int i = 0; i < minLength; i++) { if (val1.Contains(val2[i])) charIndex++; }
return Math.Round(charIndex / maxLength * 100);
}
public static List<string> Search(this string[] values, string searchKey, double threshold)
{
List<string> result = new List<string>();
for (int i = 0; i < values.Length; i++) if (CompareStrings(values[i], searchKey) > threshold) result.Add(values[i]);
return result;
}
}
Example of usage :
string[] array = { "Aardema & Whitelaw", "Aafedt Forde Gray", "Whitelaw & Sears-Ewald" };
var result = array.Search("WhitelawSears-Ewald", 80);
// Results that match this string by more than 80%
foreach (var item in result)
{
Console.WriteLine(item);
}
Output: Whitelaw & Sears-Ewald

If you want an easy (not very handy) solution,
var result = supplierListQuery
.Select(x => normalize(x.SupplierName))
.Where(x => x.Contains(normalize(SearchKey)));
string normalize(string inputStr)
{
string retVal = inputStr.Replace("&", "");
// Collapse runs of spaces (left behind after removing "&") into a single space
while (retVal.IndexOf("  ") >= 0)
{
retVal = retVal.Replace("  ", " ");
}
return retVal;
}

Related

Split and then Joining the String step by step - C# Linq

Here is my string:
www.stackoverflow.com/questions/ask/user/end
I split it with / into a list of separate words: myString.Split('/').ToList()
Output:
www.stackoverflow.com
questions
ask
user
end
and I need to rejoin the string to get a list like this:
www.stackoverflow.com
www.stackoverflow.com/questions
www.stackoverflow.com/questions/ask
www.stackoverflow.com/questions/ask/user
www.stackoverflow.com/questions/ask/user/end
I thought about LINQ Aggregate, but it does not seem suitable here. I want to do this entirely with LINQ.
You can try iterating over it with foreach
var splitted = "www.stackoverflow.com/questions/ask/user/end".Split('/').ToList();
string full = "";
foreach (var part in splitted)
{
// Avoid a leading "/" on the first part
full = full.Length == 0 ? part : $"{full}/{part}";
Console.WriteLine(full);
}
Or use linq:
var splitted = "www.stackoverflow.com/questions/ask/user/end".Split('/').ToList();
var list = splitted.Select((x, i) => string.Join("/", splitted.Take(i + 1)));
Linq with side effect:
string prior = null;
var result = "www.stackoverflow.com/questions/ask/user/end"
.Split('/')
.Select(item => prior == null
? prior = item
: prior += "/" + item)
.ToList();
Let's print it out
Console.WriteLine(string.Join(Environment.NewLine, result));
Outcome:
www.stackoverflow.com
www.stackoverflow.com/questions
www.stackoverflow.com/questions/ask
www.stackoverflow.com/questions/ask/user
www.stackoverflow.com/questions/ask/user/end
Linq without side effects ;)
Enumerable.Aggregate can be used here if we use List<T> as a result.
var raw = "www.stackoverflow.com/questions/ask/user/end";
var actual =
raw.Split('/')
.Aggregate(new List<string>(),
(list, word) =>
{
var combined = list.Any() ? $"{list.Last()}/{word}" : word;
list.Add(combined);
return list;
});
Without LINQ, you can use a plain loop:
var str = "www.stackoverflow.com/questions/ask/user/end";
string[] full = str.Split('/');
string Result = string.Empty;
for (int i = 0; i < full.Length; i++)
{
Console.WriteLine(full[i]);
}
for (int i = 0; i < full.Length; i++)
{
if (i == 0)
{
Result = full[i];
}
else
{
Result += "/" + full[i];
}
Console.WriteLine(Result);
}
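As another option between the side-effect lambda and the Aggregate version, the running prefix can be kept inside an iterator. This is only a sketch: the name CumulativePrefixes is hypothetical, and it uses C# 9 top-level statements with a local iterator function.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: lazily yields each cumulative prefix of a separated path.
static IEnumerable<string> CumulativePrefixes(string path, char separator)
{
    var builder = new StringBuilder();
    bool first = true;
    foreach (string part in path.Split(separator))
    {
        // Put the separator back between parts, but not before the first one.
        if (!first) builder.Append(separator);
        builder.Append(part);
        first = false;
        yield return builder.ToString();
    }
}

foreach (string prefix in CumulativePrefixes("www.stackoverflow.com/questions/ask/user/end", '/'))
{
    Console.WriteLine(prefix);
}
```

The mutable state stays local to the iterator, so callers can still chain LINQ operators such as Where or ToList on the result.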

Check if string contains characters in certain order in C#

I have code that works right now, but it doesn't check whether the characters are in order; it only checks that they're present. How can I modify my code so that the characters 'gaoaf' are checked in that order in the string?
Console.WriteLine("5.feladat");
StreamWriter sw = new StreamWriter("keres.txt");
sw.WriteLine("gaoaf");
string s = "";
for (int i = 0; i < n; i++)
{
s = zadatok[i].nev+zadatok[i].cim;
if (s.Contains("g") && s.Contains("a") && s.Contains("o") && s.Contains("a") && s.Contains("f") )
{
sw.WriteLine(i);
sw.WriteLine(zadatok[i].nev + zadatok[i].cim);
}
}
sw.Close();
You can convert the letters into a pattern and use Regex:
var letters = "gaoaf";
var pattern = String.Join(".*",letters.AsEnumerable());
var hasletters = Regex.IsMatch(s, pattern, RegexOptions.IgnoreCase);
For those that needlessly avoid .*, you can also solve this with LINQ:
var ans = letters.Aggregate(0, (p, c) => p >= 0 ? s.IndexOf(c.ToString(), p, StringComparison.InvariantCultureIgnoreCase) : p) != -1;
If it is possible to have repeated adjacent letters, you need to complicate the LINQ solution slightly:
var ans = letters.Aggregate(0, (p, c) => {
if (p >= 0) {
var newp = s.IndexOf(c.ToString(), p, StringComparison.InvariantCultureIgnoreCase);
return newp >= 0 ? newp+1 : newp;
}
else
return p;
}) != -1;
Given the (ugly) machinations required to basically terminate Aggregate early, and given the (ugly and inefficient) syntax required to use an inline anonymous expression call to get rid of the temporary newp, I created some extensions to help, an Aggregate that can terminate early:
public static TAccum AggregateWhile<TAccum, T>(this IEnumerable<T> src, TAccum seed, Func<TAccum, T, TAccum> accumFn, Predicate<TAccum> whileFn) {
using (var e = src.GetEnumerator()) {
if (!e.MoveNext())
throw new Exception("At least one element required by AggregateWhile");
var ans = accumFn(seed, e.Current);
while (whileFn(ans) && e.MoveNext())
ans = accumFn(ans, e.Current);
return ans;
}
}
Now you can solve the problem fairly easily:
var ans2 = letters.AggregateWhile(-1,
(p, c) => s.IndexOf(c.ToString(), p+1, StringComparison.InvariantCultureIgnoreCase),
p => p >= 0
) != -1;
Why not something like this?
static bool CheckInOrder(string source, string charsToCheck)
{
int index = -1;
foreach (var c in charsToCheck)
{
index = source.IndexOf(c, index + 1);
if (index == -1)
return false;
}
return true;
}
Then you can use the function like this:
bool result = CheckInOrder("this is my source string", "gaoaf");
This should work because IndexOf returns -1 if a string isn't found, and it only starts scanning AFTER the previous match.
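The Regex answer above ignores case while CheckInOrder does not. As a sketch only, a case-insensitive variant of the same scan could look like the following; the name ContainsInOrder and the sample strings are assumptions, and the code uses C# 9 top-level statements.

```csharp
using System;

// Hypothetical variant of CheckInOrder: case-insensitive subsequence test.
static bool ContainsInOrder(string source, string letters)
{
    int next = 0; // index in source where the next search starts
    foreach (char c in letters)
    {
        int found = source.IndexOf(c.ToString(), next, StringComparison.OrdinalIgnoreCase);
        if (found < 0) return false;
        next = found + 1; // continue strictly after this match
    }
    return true;
}

Console.WriteLine(ContainsInOrder("Garden of a Fool", "gaoaf")); // True
Console.WriteLine(ContainsInOrder("fog a", "gaoaf"));            // False
```

Because each search starts one position after the previous match, the repeated 'a' in "gaoaf" has to be found twice, at two different positions.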

Sort a List in which each element contains 2 Values

I have a text file that contains Values in this Format: Time|ID:
180|1
60 |2
120|3
Now I want to sort them by Time, so the output should be:
60 |2
120|3
180|1
How can I solve this problem? With this:
var path = @"C:\Users\admin\Desktop\test.txt";
List<string> list = File.ReadAllLines(path).ToList();
list.Sort();
for (var i = 0; i < list.Count; i++)
{
Console.WriteLine(list[i]);
}
I got no success ...
3 steps are necessary to do the job:
1) split by the separator
2) convert to int because in a string comparison a 6 comes after a 1 or 10
3) use OrderBy to sort your collection
Here is a linq solution in one line doing all 3 steps:
list = list.OrderBy(x => Convert.ToInt32(x.Split('|')[0])).ToList();
Explanation
x => lambda expression, x denotes a single element in your list
x.Split('|')[0] splits each string and takes only the first part of it (time)
Convert.ToInt32(.. converts the time into a number so that the ordering will be done in the way you desire
list.OrderBy( sorts your collection
EDIT:
Just to understand why you got the result in the first place here is an example of comparison of numbers in string representation using the CompareTo method:
int res = "6".CompareTo("10");
res will have the value of 1 (meaning that 6 is larger than 10 or 6 follows 10)
According to the documentation->remarks:
The CompareTo method was designed primarily for use in sorting or alphabetizing operations.
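To see both orderings side by side on the sample data, here is a small sketch (C# 9 top-level statements). It uses an ordinal comparer for the text case so that the result does not depend on the current culture; that choice is an assumption made for illustration and is not what list.Sort() uses by default.

```csharp
using System;
using System.Linq;

string[] lines = { "180|1", "60 |2", "120|3" };

// Text ordering compares character by character, so "120" < "180" < "60".
string[] asText = lines.OrderBy(l => l, StringComparer.Ordinal).ToArray();

// Numeric ordering parses the time field first; int.Parse tolerates the trailing space in "60 ".
string[] asNumbers = lines.OrderBy(l => int.Parse(l.Split('|')[0])).ToArray();

Console.WriteLine(string.Join(", ", asText));    // 120|3, 180|1, 60 |2
Console.WriteLine(string.Join(", ", asNumbers)); // 60 |2, 120|3, 180|1
```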
You should parse each line of the file content and get values as numbers.
string[] lines = File.ReadAllLines("path");
// ID, time
var dict = new Dictionary<int, int>();
// Processing each line of the file content
foreach (var line in lines)
{
string[] splitted = line.Split('|');
int time = Convert.ToInt32(splitted[0]);
int ID = Convert.ToInt32(splitted[1]);
// Key = ID, Value = Time
dict.Add(ID, time);
}
var orderedListByID = dict.OrderBy(x => x.Key).ToList();
var orderedListByTime = dict.OrderBy(x => x.Value).ToList();
Note that I use your ID reference as Key of dictionary assuming that ID should be unique.
Short code version
// Key = ID Value = Time
var orderedListByID = lines.Select(x => x.Split('|')).ToDictionary(x => Convert.ToInt32(x[1]), x => Convert.ToInt32(x[0])).OrderBy(x => x.Key).ToList();
var orderedListByTime = lines.Select(x => x.Split('|')).ToDictionary(x => Convert.ToInt32(x[1]), x => Convert.ToInt32(x[0])).OrderBy(x => x.Value).ToList();
You need to convert them to numbers first. Sorting by string won't give you meaningful results.
var times = list.Select(l => l.Split('|')[0]).Select(Int32.Parse);
var ids = list.Select(l => l.Split('|')[1]).Select(Int32.Parse);
var pairs = times.Zip(ids, (t, id) => new{Time = t, Id = id})
.OrderBy(x => x.Time)
.ToList();
Thank you all, this is my Solution:
var path = @"C:\Users\admin\Desktop\test.txt";
List<string> list = File.ReadAllLines(path).ToList();
list = list.OrderBy(x => Convert.ToInt32(x.Split('|')[0])).ToList();
for(var i = 0; i < list.Count; i++)
{
Console.WriteLine(list[i]);
}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class TestClass {
public static void main(String[] args) {
List <LineItem> myList = new ArrayList<LineItem>();
myList.add(LineItem.getLineItem(500, 30));
myList.add(LineItem.getLineItem(300, 20));
myList.add(LineItem.getLineItem(900, 100));
System.out.println(myList);
Collections.sort(myList);
System.out.println("list after sort");
System.out.println(myList);
}
}
class LineItem implements Comparable<LineItem>{
int time;
int id ;
@Override
public String toString() {
return ""+ time + "|"+ id + " ";
}
@Override
public int compareTo(LineItem o) {
return this.time-o.time;
}
public static LineItem getLineItem( int time, int id ){
LineItem l = new LineItem();
l.time=time;
l.id=id;
return l;
}
}

Find a fixed length string with specific string part in C#

I want to find a string of fixed length with specific substring. But I need to do it like we can do in SQL queries.
Example:
I have strings like -
AB012345
AB12345
AB123456
AB1234567
AB98765
AB987654
I want to select strings that have AB at first and 6 characters afterwards. Which can be done in SQL by SELECT * FROM [table_name] WHERE [column_name] LIKE 'AB______' (6 underscores after AB).
So the result will be:
AB012345
AB123456
AB987654
I need to know if there is any way to select strings in such way with C#, by using AB______.
You can use Regular Expressions to filter the result:
List<string> sList = new List<string>(){"AB012345",
"AB12345",
"AB123456",
"AB1234567",
"AB98765",
"AB987654"};
var qry = sList.Where(s=>Regex.Match(s, @"^AB\d{6}$").Success);
Considering you have an string array:
string[] str = new string[3]{"AB012345", "A12345", "AB98765"};
var result = str.Where(x => x.StartsWith("AB") && x.Length == 8).ToList();
The logic: if a string starts with AB and its length is 8, it is a match.
this should do it
List<string> sList = new List<string>(){
"AB012345",
"AB12345",
"AB123456",
"AB1234567",
"AB98765",
"AB987654"};
List<string> sREsult = sList.Where(x => x.Length == 8 && x.StartsWith("AB")).ToList();
first x.Length == 8 determines the length and x.StartsWith("AB") determines the required characters at the start of the string
This can be achieved by using string.StartsWith and string.Length like this:
public bool CheckStringValid (String input)
{
if (input.StartsWith ("AB") && input.Length == 8)
{
return true;
}
else
{
return false;
}
}
This will return true if string matches your criteria.
Hope this helps.
var strlist = new List<string>()
{
"AB012345",
"AB12345",
"AB123456",
"AB1234567",
"AB98765",
"AB987654"
};
var result = strlist.Where(
s => (s.StartsWith("AB") &&(s.Length == 8))
);
foreach(var v in result)
{
Console.WriteLine(v.ToString());
}
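Since the question asks to reuse the SQL-style pattern AB______ directly, one possible approach is to translate the LIKE pattern into a regular expression. The sketch below is an illustration under assumptions: the helper name LikeToRegex is hypothetical, only the _ and % wildcards are handled (no bracket sets or escape characters), and it uses C# 9 top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: translates a SQL LIKE pattern into an anchored regex.
// '_' matches exactly one character and '%' matches any run of characters.
static Regex LikeToRegex(string likePattern)
{
    // Regex.Escape leaves '_' and '%' untouched, so they can be swapped afterwards.
    string body = Regex.Escape(likePattern).Replace("_", ".").Replace("%", ".*");
    return new Regex("^" + body + "$", RegexOptions.Singleline);
}

string[] codes = { "AB012345", "AB12345", "AB123456", "AB1234567", "AB98765", "AB987654" };
Regex like = LikeToRegex("AB______");
string[] matches = codes.Where(c => like.IsMatch(c)).ToArray();
Console.WriteLine(string.Join(", ", matches)); // AB012345, AB123456, AB987654
```

Unlike the \d{6} pattern shown earlier, this accepts any six characters after AB, which is what the LIKE underscore means.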

Select interval linq

Is there some way with LINQ to select certain numbers with shortcut criteria.
Like this:
I have numbers from 1 to 10000.
My criteria is (4012..4190|4229), meaning take numbers between 4012 to 4190 and number 4229:
static int[] test(string criteria)
{
// criteria is 4012..4190|4229
// select numbers from lab where criteria is met
int[] lab = Enumerable.Range(0, 10000).ToArray();
return lab;
}
This should be enough for your case:
return lab.Where((int1) => (int1 >= 4012 && int1 <= 4190) || int1 == 4229).ToArray();
Also a quick way of parsing your criteria would be to use RegEx:
Regex r = new Regex(@"\d+");
MatchCollection m = r.Matches(criteria);
int start = int.Parse(m[0].Value);
int end = int.Parse(m[1].Value);
int specific = int.Parse(m[2].Value);
return lab.Where((int1) => (int1 >= start && int1 <= end) || int1 == specific).ToArray();
If your criteria is always a string, you need some way to parse it into a Func<int, bool>, but that is not LINQ-specific. In the end you'll need something like this:
Func<int, bool> predicate = Parse(criteria);
return lab.Where(predicate).ToArray();
where very basic implementation of Parse might look as follows:
public static Func<int, bool> Parse(string criteria)
{
var alternatives = criteria
.Split('|')
.Select<string, Func<int, bool>>(
token =>
{
if (token.Contains(".."))
{
var between = token.Split(new[] {".."}, StringSplitOptions.RemoveEmptyEntries);
int lo = int.Parse(between[0]);
int hi = int.Parse(between[1]);
return x => lo <= x && x <= hi;
}
else
{
int exact = int.Parse(token);
return x => x == exact;
}
})
.ToArray();
return x => alternatives.Any(alt => alt(x));
}
You can concatenate two sequences (Enumerable.Range takes a start and a count, so the inclusive range needs + 1):
int[] lab = Enumerable.Range(4012, 4190 - 4012 + 1).Concat(Enumerable.Range(4229, 1)).ToArray();
Update:
you need to parse incoming criteria first
static int[] test(string criteria)
{
// criteria is 4012..4190|4229
// select numbers from lab where criteria is met
// assume you parsed your criteria into a two-dimensional array
// I used a count for the second part for convenience
int[][] criteriaArray = { new int[]{ 4012, 179 }, new int[]{ 4229, 1 } }; // 4012..4190 is 179 numbers
var seq = Enumerable.Range(criteriaArray[0][0], criteriaArray[0][1]);
for (int i = 1; i < criteriaArray.Length; i++)
{
int start = criteriaArray[i][0];
int count = criteriaArray[i][1];
seq = seq.Concat(Enumerable.Range(start, count));
}
return seq.ToArray();
}
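The parsing step that the snippet above leaves out could be sketched as follows. This is only an illustration: the helper name Expand is hypothetical, it assumes well-formed input where every token is either "n" or "lo..hi" with lo <= hi, and it generates the numbers directly instead of filtering lab. It uses C# 9 top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: expands a criteria string such as "4012..4190|4229"
// into the numbers it describes, treating "lo..hi" as an inclusive range.
static IEnumerable<int> Expand(string criteria)
{
    return criteria.Split('|').SelectMany(token =>
    {
        string[] bounds = token.Split(new[] { ".." }, StringSplitOptions.None);
        int lo = int.Parse(bounds[0]);
        int hi = bounds.Length > 1 ? int.Parse(bounds[1]) : lo;
        // Enumerable.Range takes a count, so an inclusive range needs hi - lo + 1.
        return Enumerable.Range(lo, hi - lo + 1);
    });
}

int[] selected = Expand("4012..4190|4229").ToArray();
Console.WriteLine(selected.Length); // 180
```

If the result must be limited to values that actually exist in lab, it can be intersected afterwards, for example with lab.Intersect(selected).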
You could use the Wolfram Language:
Flatten[{Range[4012, 4190], 4229}]
In a way, 4012..4190|4229 would work as well, but the answer is exactly that: a list of items from 4012 to 4190 plus the item 4229.
A lambda just imitates pure functions. Unless you have a free Wolfram kernel, this approach might not be the most cost-effective, but you do not need to write boilerplate code.
