Find the longest match of a string in a list (C#)

I have a list of strings, and I need to find the longest item in the list that matches my search string.
For example, the list contains "test", "abc", "testing", "testingap" and my search string is 'testingapplication'.
The result should be 'testingap'.
Here is what I have so far. It works, but I'm wondering whether there is a more efficient way to do this:
string search = "testingapplication";
List<string> names = new List<string>(new[] { "test", "abc", "testing", "testingap" });
List<string> matchedItems = new List<string>();
foreach (string item in names)
{
    if (search.Contains(item))
    {
        matchedItems.Add(item);
        Console.WriteLine(item);
    }
}
var WordMatch = matchedItems.Aggregate("", (max, cur) => max.Length > cur.Length ? max : cur);
Console.WriteLine("WordMatch" + WordMatch);

Since you are already using LINQ, you could order your "names" by length via the OrderByDescending() method and take the first one that occurs in your search string using FirstOrDefault(), as seen below:
var match = names.OrderByDescending(n => n.Length)
                 .FirstOrDefault(n => search.Contains(n));
if (match == null)
{
    // No match was found, handle accordingly.
}
else
{
    // match will contain your longest string
}

Related

Contains without order

I want to search a list of strings using a set of characters and want to find matches regardless of order. For example if my list contains
List<string> testList = new List<string>() { "can", "rock", "bird" };
I want to be able to search using "irb" and have it return bird. I have to do this many times so I am looking for the most efficient way of doing it.
var query = "irb";
List<string> testList = new List<string>() { "can", "rock", "bird" };
var result = testList.Where(i => query.All(q => i.Contains(q)));
For each item in testList, this tests whether the item contains all the letters in query.
For your scenario, you need to check each character of the search word against each word in the list.
You can do it like this:
// Checks whether every character of s2 is present in s1
Func<string, string, bool> isContain = (s1, s2) =>
{
    int matchingLength = 0;
    foreach (var c2 in s2)
    {
        foreach (var c1 in s1)
        {
            if (c1 == c2)
            {
                ++matchingLength;
                break; // count each search character once, even if s1 repeats it
            }
        }
    }
    // If every character of s2 was found, the word is treated as a match
    return s2.Length == matchingLength;
};
List<string> testList = new List<string>() { "can", "rock", "bird" };
string name = "irb";
var filteredList = testList.Where(x => isContain(x, name));
If you don't care about matching duplicates, then checking that all characters of the sequence you are searching for are contained in the word will do as the predicate:
"irb".Except("bird").Count() == 0
And whole condition:
List<string> testList = new List<string>() { "can", "rock", "bird" };
var search = "irb";
var matches = testList.Where(word => !search.Except(word).Any());
Notes:
You need to normalize all words to lowercase if letters in mixed case should match.
If the performance of searching for different values is critical, convert the search string to a HashSet first and do the Except manually.
If you need to match different values against the same list many times, convert the list of strings to a list of HashSet and use search.All(c => wordAsHashSet.Contains(c)) as the condition.
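The last note could look roughly like the sketch below. It assumes the word list stays fixed while many queries run against it; index and FindMatches are hypothetical names, and the code is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var testList = new List<string> { "can", "rock", "bird" };

// Build one character set per word up front. Lowercasing here (and in the
// query) makes the match case-insensitive, as the first note suggests.
var index = testList
    .Select(word => new
    {
        Word = word,
        Chars = new HashSet<char>(word.ToLowerInvariant())
    })
    .ToList();

// Hypothetical helper: a word matches when every character of the query
// is present in that word's precomputed set.
List<string> FindMatches(string query)
{
    string normalized = query.ToLowerInvariant();
    return index
        .Where(entry => normalized.All(c => entry.Chars.Contains(c)))
        .Select(entry => entry.Word)
        .ToList();
}

Console.WriteLine(string.Join(", ", FindMatches("irb"))); // prints "bird"
```

Like the Except version, this ignores duplicates: a query of "bb" still matches "bird".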
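You can use LINQ to achieve this:
List<string> testList = new List<string>() { "can", "rock", "bird" };
var lst = testList.Where(x => x.ToUpperInvariant().Contains("IRD")).ToList();
Make sure the comparison ignores case: convert the list item with ToUpperInvariant and make the string you compare against upper case as well. Note that Contains only finds "IRD" as a contiguous substring, so this does not match characters in arbitrary order.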

List.Any get matched String

FilePrefixList.Any(s => FileName.StartsWith(s))
Can I get s value here? I want to display the matched string.
Any only determines whether there is a match. It returns nothing apart from the bool, and it has to execute the query to do so.
You can use Where or First/FirstOrDefault:
string firstMatch = FilePrefixList.FirstOrDefault(s => FileName.StartsWith(s)); // null if no match
// Alternatively, keep the query around and take the first result from it:
var allMatches = FilePrefixList.Where(s => FileName.StartsWith(s));
string firstOfAll = allMatches.FirstOrDefault(); // null if no match
So Any is fine if all you need to know is whether there's a match; otherwise you can use FirstOrDefault to get the first match, or null if there is none (in the case of reference types).
Because Any has to execute the query, this is less efficient:
string firstMatch = null;
if (FilePrefixList.Any(s => FileName.StartsWith(s)))
{
    // second execution
    firstMatch = FilePrefixList.First(s => FileName.StartsWith(s));
}
If you want to put all matches into a separate collection like a List<string>:
List<string> matchList = allMatches.ToList(); // or ToArray()
If you want to output all matches you can use String.Join:
string matchingFiles = String.Join(",", allMatches);
Not with Any, no... that's only meant to determine whether there are any matches, which is why it returns bool. However, you can use FirstOrDefault with a predicate instead:
var match = FilePrefixList.FirstOrDefault(s => FileName.StartsWith(s));
if (match != null)
{
// Display the match
}
else
{
// Nothing matched
}
If you want to find all the matches, use Where instead.
If FilePrefixList is a List<string>, you can use the List<T>.Find method:
string first = FilePrefixList.Find(s => FileName.StartsWith(s));
fiddle: List.Find vs LINQ (Find is faster)
List<T>.Find (MSDN) returns the first element that matches the conditions defined by the specified predicate, if found; otherwise, the default value for type T
Enumerable.Any() returns bool denoting whether any item matched the criteria.
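If you need the matched item and expect at most one match, use SingleOrDefault() instead. It throws an InvalidOperationException when more than one element matches, so prefer FirstOrDefault() if several prefixes can match: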
var matchedPrefix = FilePrefixList.SingleOrDefault(s => FileName.StartsWith(s));
See MSDN
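The difference between the two calls shows up as soon as two prefixes match the same file name. A small sketch with made-up data (the prefixes and file name below are hypothetical, and the code assumes a C# 9 top-level program):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: both "rep" and "report" are prefixes of the file name.
var prefixes = new List<string> { "rep", "report", "img" };
string fileName = "report_2013.txt";

// FirstOrDefault stops at the first hit in list order.
string first = prefixes.FirstOrDefault(p => fileName.StartsWith(p, StringComparison.Ordinal));
Console.WriteLine(first); // prints "rep"

// SingleOrDefault insists on at most one hit and throws otherwise.
bool threw = false;
try
{
    prefixes.SingleOrDefault(p => fileName.StartsWith(p, StringComparison.Ordinal));
}
catch (InvalidOperationException)
{
    threw = true;
}
Console.WriteLine(threw); // prints "True"
```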
Please try this.
We assume FilePrefixList is a list of objects:
class A
{
    public int ID { get; set; }
    public string Name { get; set; }
}
List<A> FilePrefixList = new List<A>();
FilePrefixList.Add(new A
{
    ID = 1,
    Name = "One"
});
FilePrefixList.Add(new A
{
    ID = 2,
    Name = "Two"
});
FilePrefixList.Add(new A
{
    ID = 3,
    Name = "Three"
});
Selecting data from the list:
var listItems = FilePrefixList.Where(x => x.Name.StartsWith("T")).ToList();

How can I remove numbers/digits from strings in a List<string>?

I have a List of strings:
List<string> _words = ExtractWords(strippedHtml);
_words contains 1799 items, each of which is a string.
Some of the strings contain only numbers, for example:
" 2" or "2013"
I want to remove these strings and so in the end the List will contain only strings with letters and not digits.
A string like "001hello" is OK but "001" is not OK and should be removed.
You can use LINQ for that:
_words = _words.Where(w => w.Any(c => !Char.IsDigit(c))).ToList();
This would filter out strings that consist entirely of digits, along with empty strings.
_words = _words.Where(w => !w.All(char.IsDigit))
.ToList();
For removing words that are made only of digits and whitespace:
var good = new List<string>();
var _regex = new Regex(@"^[\d\s]*$");
foreach (var s in _words) {
    if (!_regex.IsMatch(s))
        good.Add(s);
}
If you want to use LINQ something like this should do:
_words = _words.Where(w => w.Any(c => !char.IsDigit(c) && !char.IsWhiteSpace(c)))
.ToList();
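You can use a traditional foreach and int.TryParse to detect numbers.
This is likely to be faster than Regex or LINQ, but note that int.TryParse only recognizes values that fit in an Int32, so a very long digit string would be kept.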
var stringsWithoutNumbers = new List<string>();
foreach (var str in _words)
{
int n;
bool isNumeric = int.TryParse(str, out n);
if (!isNumeric)
{
stringsWithoutNumbers.Add(str);
}
}
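To see where the TryParse approach and the character test disagree, here is a small comparison on made-up input. The sample words are assumptions chosen to hit the edge cases (leading whitespace and a digit string too large for Int32); the code assumes a C# 9 top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample input covering the edge cases.
var words = new List<string> { " 2", "2013", "001hello", "12345678901", "hello" };

// int.TryParse accepts surrounding whitespace but rejects values that
// overflow Int32, so "12345678901" survives this filter.
var viaTryParse = words.Where(w => !int.TryParse(w, out _)).ToList();

// Character test: keep a word only if it has at least one character
// that is neither a digit nor whitespace.
var viaChars = words
    .Where(w => w.Any(c => !char.IsDigit(c) && !char.IsWhiteSpace(c)))
    .ToList();

Console.WriteLine(string.Join(", ", viaTryParse)); // prints "001hello, 12345678901, hello"
Console.WriteLine(string.Join(", ", viaChars));    // prints "001hello, hello"
```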

search the database for the words within a string

Imagine that a user has entered a sentence and I need to search for the subjects that contain the words of that sentence. This is the code that I thought would solve the case:
var result = from x in dataBase.tableName
select x;
string[] words = enteredString.Split();
foreach(string word in words)
{
result = result.Where(x => x.subject.Contains(word));
}
It only shows the search results for the last word in the sentence, but I thought the result would be narrowed down each time a word is used in the Where line.
Try this:
foreach(string word in words)
{
var temp = word;
result = result.Where(x => x.subject.Contains(temp));
}
This is called (by ReSharper at least) "access to modified closure": lambda expressions don't capture the value, they capture the variable itself. The value of the variable word changes with each iteration of the loop, and because Where() is lazily evaluated, by the time the sequence is consumed word holds the last value in the sequence. (From C# 5 onward, foreach declares a fresh loop variable for each iteration, so the temporary copy is only needed with older compilers.)
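The capture behaviour can be reproduced on any compiler version by sharing one variable that is declared outside the loop. This is an in-memory sketch with LINQ to Objects rather than a database query, and the sample data is made up; it assumes a C# 9 top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] words = { "red", "apple" };
var subjects = new List<string> { "red apple", "green apple", "red car" };

// Shared variable: both lambdas capture `current` itself, so when the
// query finally runs they both see its last value, "apple".
string current = "";
IEnumerable<string> shared = subjects;
foreach (string word in words)
{
    current = word;
    shared = shared.Where(s => s.Contains(current));
}

// Per-iteration copy: each lambda captures its own `temp`.
IEnumerable<string> copied = subjects;
foreach (string word in words)
{
    string temp = word;
    copied = copied.Where(s => s.Contains(temp));
}

Console.WriteLine(string.Join(", ", shared)); // prints "red apple, green apple"
Console.WriteLine(string.Join(", ", copied)); // prints "red apple"
```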
I had some success by inverting the logic like this:
string[] words = enteredString.Split();
var results = from x in database.TableName
where words.Any(w => x.subject.Contains(w))
select x;
-- Edit
A more generic approach, for this kind of queries, would be:
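class SearchQuery
{
    public SearchQuery()
    {
        // The collections must exist before a collection initializer can add to them
        Include = new List<string>();
        Exclude = new List<string>();
    }
    public ICollection<string> Include { get; private set; }
    public ICollection<string> Exclude { get; private set; }
}
[...]
SearchQuery query = new SearchQuery
{
    Include = { "Foo" }, Exclude = { "Bar" }
};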
var results = from x in database.Table
where query.Include.All(i => x.Subject.Contains(i)) &&
query.Exclude.All(i => !x.Subject.Contains(i))
select x;
This assumes that all words in query.Include must occur in Subject. If you want to find any subjects that contain at least one of the words, query.Include.All should be query.Include.Any.
I've tested this with Entity Framework 4, which creates a SQL query that applies all criteria in the database rather than in memory.
Here you go:
var result = from x in dataBase.tableName
select x;
string[] words = enteredString.Split();
result = result.Where(r => words.Any(w => r.Subject.Contains(w)));
It can't work as written, since with every word you are overwriting the previous result. You need to do something similar to this:
List<object> AllResults = new List<object>();
foreach (string word in words)
{
    var temp = word;
    AllResults.AddRange(result.Where(x => x.subject.Contains(temp)).ToList());
}
I'm not sure what type your results are, hence the List<object>.

Get Count in List of instances contained in a string

I have a string containing up to 9 unique numbers from 1 to 9 (myString) e.g. "12345"
I have a list of strings {"1"}, {"4"} (myList) .. and so on.
I would like to know how many instances in the string (myString) are contained within the list (myList), in the above example this would return 2.
so something like
count = myList.Count(myList.Contains(myString));
I could change myString to a list if required.
Thanks very much
Joe
I would try the following:
count = myList.Count(s => myString.Contains(s));
It is not perfectly clear what you need, but these are some options that could help:
myList.Where(s => s == myString).Count()
or
myList.Where(s => s.Contains(myString)).Count()
The first would return the number of strings in the list that are equal to yours; the second would return the number of strings that contain yours. If neither works, please make your question clearer.
If myList is just List<string>, then this should work:
int count = myList.Count(x => myString.Contains(x));
If myList is List<List<string>>:
int count = myList.SelectMany(x => x).Count(s => myString.Contains(s));
Try
count = myList.Count(s => s==myString);
This is one approach, but it's limited to 1 character matches. For your described scenario of numbers from 1-9 this works fine. Notice the s[0] usage which refers to the list items as a character. For example, if you had "12" in your list, it wouldn't work correctly.
string input = "123456123";
var list = new List<string> { "1", "4" };
var query = list.Select(s => new
{
Value = s,
Count = input.Count(c => c == s[0])
});
foreach (var item in query)
{
Console.WriteLine("{0} occurred {1} time(s)", item.Value, item.Count);
}
For multiple character matches, which would correctly count the occurrences of "12", the Regex class comes in handy:
var query = list.Select(s => new
{
Value = s,
Count = Regex.Matches(input, Regex.Escape(s)).Count // Escape guards against regex metacharacters in s
});
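If you would rather not pull in Regex for this, the same non-overlapping count can be sketched with IndexOf. CountOccurrences is a hypothetical helper name, and the code assumes a C# 9 top-level program.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: counts non-overlapping occurrences of `value`
// in `input` using an ordinal (culture-independent) search.
static int CountOccurrences(string input, string value)
{
    if (string.IsNullOrEmpty(value))
        return 0;

    int count = 0;
    int index = 0;
    while ((index = input.IndexOf(value, index, StringComparison.Ordinal)) >= 0)
    {
        count++;
        index += value.Length; // continue searching after this match
    }
    return count;
}

string input = "123456123";
var list = new List<string> { "1", "4", "12" };
foreach (string item in list)
{
    Console.WriteLine("{0} occurred {1} time(s)", item, CountOccurrences(input, item));
}
// prints:
// 1 occurred 2 time(s)
// 4 occurred 1 time(s)
// 12 occurred 2 time(s)
```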
Try:
var count = myList.Count(x => myString.ToCharArray().Contains(x[0]));
This will only work if each item in myList is a single digit.
Edit: as you probably noticed, this converts myString to a char array once per list item, so it would be better to have
var myStringArray = myString.ToCharArray();
var count = myList.Count(x => myStringArray.Contains(x[0]));
