How to find a match with 2 comma separated strings with LINQ - c#

I am new to LINQ.
I am trying to compare 2 comma separated strings to see if they contain a matching value.
I have a string that contains a list of codes.
masterFormList = "AAA,BBB,CCC,FFF,GGG,HHH"
I am trying to compare it to a list of objects. Each object has a FormCodes field that contains a comma separated string of codes. I want to see if at least one code in this string is in the masterFormList.
How would I write linq to accomplish this?
Right now I have:
resultsList = (from r in resultsList
where r.FormCodes.Split(',').Contains(masterFormList)
select r).ToList();
It does not return any matching items from the list.
Please advise

You'd need to build a collection of the items to search for, then check to see if there are any contained within that set:
var masterSet = new HashSet<string>(masterFormList.Split(','));
resultsList = resultsList
.Where(r => r.FormCodes.Split(',')
.Any(code => masterSet.Contains(code)))
.ToList();
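The original query returns nothing because Contains(masterFormList) asks whether a single code equals the entire master string. A minimal self-contained sketch of the set-based filter above; the Result class and the FormMatcher helper are hypothetical names introduced only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the objects held in resultsList.
public class Result
{
    public string FormCodes { get; set; } = "";
}

public static class FormMatcher
{
    // Keeps only the items whose FormCodes share at least one code with masterFormList.
    public static List<Result> FilterByMasterList(IEnumerable<Result> results, string masterFormList)
    {
        // Build the lookup set once, outside the per-item predicate.
        var masterSet = new HashSet<string>(masterFormList.Split(','));

        return results
            .Where(r => r.FormCodes.Split(',').Any(code => masterSet.Contains(code)))
            .ToList();
    }
}
```

Calling FormMatcher.FilterByMasterList(resultsList, masterFormList) would then keep an item with FormCodes "XXX,BBB" and drop one with "YYY,ZZZ".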

var masterFormList = "AAA,BBB,CCC,FFF,GGG,HHH";
var otherList = "XXX,BBB,YYY";
bool match = otherList.Split(',').Intersect(masterFormList.Split(',')).Any();
or if you want the matching items
var matches = otherList.Split(',').Intersect(masterFormList.Split(',')).ToList();

To answer the question as stated, this will find all the matches between two strings:
var matches =
from masterCode in masterFormList.Split(',')
join formCode in formCodes.Split(',') on masterCode equals formCode
select formCode;
foreach (string match in matches)
{
Console.WriteLine(match);
}
but that's overkill if all you want to know is that one exists. You could just do this with the same query:
Console.WriteLine(matches.Any());
However, that's likely to do more work than strictly necessary. A modification to Reed Copsey's answer might be simplest (if we're looking to answer the question in the title of your post):
var masterSet = new HashSet<string>(masterFormList.Split(','));
bool atLeastOneMatch = formCodes.Split(',').Any(c => masterSet.Contains(c));
While these are reasonably idiomatic LINQ solutions to the problem you stated ("I am trying to compare 2 comma separated strings to see if they contain a matching value") they're probably not a great match for what you actually seem to want, which is to take a list of objects, and find just the ones in which a particular property meets your criteria. And a join is probably the wrong approach for that because it looks rather unwieldy:
resultList =
(from formItem in resultList
from code in formItem.FormCodes.Split(',')
join masterCode in masterFormList.Split(',') on code equals masterCode
group code by formItem into matchGroup
select matchGroup.Key)
.ToList();
or if you prefer:
resultList =
(from formItem in resultList
from code in formItem.FormCodes.Split(',')
join masterCode in masterFormList.Split(',') on code equals masterCode into matchGroup
where matchGroup.Any()
select formItem)
.Distinct()
.ToList();
These solutions have little to commend them...
So given the problem evident from your code (as opposed to the problem defined in the question title and the first 3 paragraphs of your post), Reed Copsey's solution is better.
The one tweak I'd make is that if your master set is fixed, you'd only want to build that HashSet<string> once, to amortize the costs. So either you'd put it in a static field:
private static readonly HashSet<string> masterSet =
new HashSet<string>(masterFormList.Split(','));
or use Lazy<T> to create it on demand.
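One way the Lazy<T> variant might look, as a sketch only. The MasterCodes class, its member names, and the assumption that the master list is a compile-time constant are all illustrative and not from the original post:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MasterCodes
{
    // Assumed to be fixed; the value is the example from the question.
    private const string MasterFormList = "AAA,BBB,CCC,FFF,GGG,HHH";

    // The set is built the first time Set is read and reused afterwards.
    private static readonly Lazy<HashSet<string>> LazySet =
        new Lazy<HashSet<string>>(() => new HashSet<string>(MasterFormList.Split(',')));

    public static HashSet<string> Set => LazySet.Value;

    // True when at least one code in formCodes is in the master set.
    public static bool HasAnyMatch(string formCodes) =>
        formCodes.Split(',').Any(code => Set.Contains(code));
}
```

The default Lazy<T> constructor used here is thread-safe, so concurrent callers still build the set only once.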
(Edited 8/8/2013 after Reed pointed out to me in the comments that the problem evident from the code example was not the same as the problem stated in the question.)

Related

Lambda Function to find most popular word in a List C# [duplicate]

This question already has answers here:
How to Count Duplicates in List with LINQ
(7 answers)
Closed 2 years ago.
I currently have what I believe is a lambda function in C# (I'm fairly new to coding and haven't used a lambda function before, so go easy). It groups duplicate strings from FilteredList, counts the number of occurrences of each, and stores that value in Count. I only want the most used word from the list, which I've managed to get with the groups.OrderBy(...) line, but I'm pretty sure I've made this very complicated for myself and very inefficient, particularly by adding the dictionary and the key-value pairs.
var groups =
from s in FilteredList
group s by s into g
// orderby g descending
select new
{
Stuff = g.Key,
Count = g.Count()
};
groups = groups.OrderBy(g => g.Count).Reverse().Take(1);
var dictionary = groups.ToDictionary(g => g.Stuff, g => g.Count);
foreach (KeyValuePair<string, int> kvp in dictionary)
{
Console.WriteLine("Key = {0}, Value = {1}", kvp.Key, kvp.Value);
}
Would someone please either help me through this and explain a little bit of it to me, or at least point me in the direction of some learning materials which may help me better understand this?
For extra info: The FilteredList comes from a large piece of external text, read into a List of strings (split by delimiters), minus a list of string stop words.
Also, if this is not a lambda function or I've got any of the info in here incorrect, please kindly correct me so I can fix the question to be more relevant & help me find an answer.
Thanks in advance.
Yes, I think you have overcomplicated it somewhat. Assuming your list of words is like:
var words = new[] { "what's", "the", "most", "most", "most", "mentioned", "word", "word" };
You can get the most mentioned word with:
words.GroupBy(w => w).OrderByDescending(g => g.Count()).First().Key;
Of course, you'd probably want to assign it to a variable, and presentationally you might want to break it into multiple lines:
var mostFrequentWord = words
.GroupBy(w => w) //make a list of sublists of words, like a dictionary of word:list<word>
.OrderByDescending(g => g.Count()) //order by sublist count descending
.First() //take the first list:sublist
.Key; //take the word
The GroupBy produces a collection of IGroupings, which is like a Dictionary<string, List<string>>. It maps each word (the key of the dictionary) to a list of all the occurrences of that word. In my example data, the IGrouping with the Key of "most" will be mapped to a List<string> of {"most","most","most"}, which has the highest count of elements at 3. If we OrderByDescending the groupings based on the Count() of each of the lists and then take the First, we'll get the IGrouping with a Key of "most", so all we need to do to retrieve the actual word is pull the Key out.
If the word is just one of the properties of a larger object, then you can .GroupBy(o => o.Word). If you want some other property from the IGrouping, such as that of its first or last element, then you can take that instead of the Key, but bear in mind that the property you end up taking might be different each time unless you enforce ordering of the list inside the grouping.
If you want to make this more efficient, then you can install MoreLinq and use MaxBy; getting the Max word By the count of the lists means you can avoid a sort operation. You could also avoid LINQ and use a dictionary:
string[] words = new[] { "what", "is", "the", "most", "most", "most", "mentioned", "word", "word" };
var maxK = "";
var maxV = -1;
var d = new Dictionary<string, int>();
foreach(var w in words){
if(!d.ContainsKey(w))
d[w] = 0;
d[w]++;
if(d[w] > maxV){
maxK = w;
maxV = d[w];
}
}
Console.WriteLine(maxK);
This keeps a dictionary that counts words as it goes, and will be more efficient than the LINQ route as it needs only a single pass of the word list plus the associated dictionary lookups, in contrast to "convert wordlist to list of sublists, sort list of sublists by sublist count, take first list item".
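A middle ground between the two approaches, sketched under the assumption that no extra package is wanted: keep GroupBy for the counting but replace the sort with a single scan for the largest group. WordStats and MostFrequent are hypothetical names. On .NET 6 or later, the built-in Enumerable.MaxBy can express the same scan.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordStats
{
    // Returns the most frequent word, or null when the input is empty.
    public static string MostFrequent(IEnumerable<string> words)
    {
        string best = null;
        int bestCount = 0;

        // One pass over the groups instead of sorting them.
        foreach (var group in words.GroupBy(w => w))
        {
            int count = group.Count();
            if (count > bestCount)
            {
                // Strictly greater, so ties keep the word that appeared first.
                best = group.Key;
                bestCount = count;
            }
        }

        return best;
    }
}
```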
This should work:
var mostPopular = groups
.GroupBy(item => new {item.Stuff, item.Count})
.Select(g=> g.OrderByDescending(x=> x.Count).FirstOrDefault())
.ToList();
OrderByDescending along with .First() combines your usage of OrderBy, Reverse() and Take.
First part is a Linq operation to read the groups from the FilteredList.
var groups =
from s in FilteredList
group s by s into g
// orderby g descending
select new
{
Stuff = g.Key,
Count = g.Count()
};
The lambda is the part that uses the => operator. The left side names the parameter and the right side is the expression that is evaluated for each element at run time.
Example on your code:
groups = groups.OrderBy(g => g.Count).Reverse().Take(1);
Reading this, 'g' represents each element of 'groups', which has a 'Count' property. The sequence is ordered by that count, 'Reverse' flips it so the largest comes first, and 'Take(1)' keeps the first element only.
As for documentation, best to search inside Stack Overflow, please check these links:
C# Lambda expressions: Why should I use them? - StackOverflow
Lambda Expressions in C# - external
Using a Lambda Expression Over a List in C# - external
Second step: if the data is coming from an external source and there are no performance issues, you can leave refactoring the code for later. A more detailed data analysis would be needed to make sure another algorithm works.

How to search any items of a list within a string using Linq

I am going to write a search query. In that query I have a list of strings like this:
var listWords=new List<string>(){"Hello","Home","dog","what"};
I also have a list of customers. How can I check whether a customer's Name contains at least one of the items in listWords? For example:
Jack Home
Hot dog
what a big dog
Tried:
var prodList = events.Where(x =>listWords.IndexOf(x.Name.Trim().ToLower()) != -1).ToList();
Use .Where and .Any:
var result = events.Where(c => listWords.Any(w => c.Name.Contains(w)));
The problem with your solution is that you are converting your string to lower case, but the collection of words has upper case characters. In that case it will not find a match:
How can I make Array.Contains case-insensitive on a string array?
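A case-insensitive variant of the Where/Any query could look like the sketch below. The Customer class and the NameSearch helper are hypothetical; the comparison uses string.IndexOf with StringComparison.OrdinalIgnoreCase so that neither side needs to be lower-cased first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the customer objects being searched.
public class Customer
{
    public string Name { get; set; } = "";
}

public static class NameSearch
{
    // Returns the customers whose Name contains at least one of the words, ignoring case.
    public static List<Customer> WithAnyWord(IEnumerable<Customer> customers, IEnumerable<string> words)
    {
        return customers
            .Where(c => words.Any(w => c.Name.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0))
            .ToList();
    }
}
```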

Finding the list of common objects between two lists

I have a list of objects of a class, for example:
class MyClass
{
public string id;
public string name;
public string lastname;
}
so for example: List<MyClass> myClassList;
and I also have a list of strings containing some ids, for example:
List<string> myIdList;
Now I am looking for a method that accepts these two as parameters and returns a List<MyClass> of the objects whose id appears in myIdList.
NOTE: myClassList is always the bigger list, and myIdList is always a smaller subset of its ids.
How can we find this intersection?
So you're looking to find all the elements in myClassList where myIdList contains the ID? That suggests:
var query = myClassList.Where(c => myIdList.Contains(c.id));
Note that if you could use a HashSet<string> instead of a List<string>, each Contains test will potentially be more efficient - certainly if your list of IDs grows large. (If the list of IDs is tiny, there may well be very little difference at all.)
It's important to consider the difference between a join and the above approach in the face of duplicate elements in either myClassList or myIdList. A join will yield every matching pair - the above will yield either 0 or 1 element per item in myClassList.
Which of those you want is up to you.
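The difference is easiest to see with a duplicated id. The sketch below is illustrative only (DuplicateDemo and its method names are hypothetical) and simply counts what each approach yields:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyClass
{
    public string id;
    public string name;
    public string lastname;
}

public static class DuplicateDemo
{
    // Where + Contains: each item in the list is yielded at most once.
    public static int CountWithWhere(List<MyClass> items, List<string> ids) =>
        items.Where(c => ids.Contains(c.id)).Count();

    // Join: one result per matching (item, id) pair, so duplicate ids multiply the output.
    public static int CountWithJoin(List<MyClass> items, List<string> ids) =>
        items.Join(ids, c => c.id, id => id, (c, id) => c).Count();
}
```

With a single item whose id is "1" and an id list of { "1", "1" }, the Where version counts 1 and the join counts 2.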
EDIT: If you're talking to a database, it would be best if you didn't use a List<T> for the entities in the first place - unless you need them for something else, it would be much more sensible to do the query in the database than fetching all the data and then performing the query locally.
That isn't strictly an intersection (unless the ids are unique), but you can simply use Contains, i.e.
var sublist = myClassList.Where(x => myIdList.Contains(x.id));
You will, however, get significantly better performance if you create a HashSet<T> first:
var hash = new HashSet<string>(myIdList);
var sublist = myClassList.Where(x => hash.Contains(x.id));
You can use a join between the two lists:
return myClassList.Join(
myIdList,
item => item.id,
id => id,
(item, id) => item)
.ToList();
It is a kind of intersection between the two lists, so read it as "I want the items from one list that are present in the second list". Here the ToList() call executes the query immediately.
var lst = myClassList.Where(x => myIdList.Contains(x.id)).ToList();
You have to use the code below:
var samedata = myClassList.Where(p => myIdList.Any(q => q == p.id));
myClassList.Where(x => myIdList.Contains(x.id));
Try
List<MyClass> GetMatchingObjects(List<MyClass> classList, List<string> idList)
{
return classList.Where(myClass => idList.Any(x => myClass.id == x)).ToList();
}
var q = myClassList.Where(x => myIdList.Contains(x.id));

Identify items in one list not in another of a different type

I need to identify items from one list that are not present in another list. The two lists are of different entities (ToDo and WorkshopItem). I consider a workshop item to be in the todo list if the Name is matched in any of the todo list items.
The following does what I'm after, but I find it awkward and hard to understand each time I revisit it. I use NHibernate QueryOver syntax to get the two lists and then a LINQ statement to filter down to just the Workshop items that meet the requirement (DateDue is in the next two weeks and the Name is not present in the list of ToDo items).
var allTodos = Session.QueryOver<ToDo>().List();
var twoWeeksTime = DateTime.Now.AddDays(14);
var workshopItemsDueSoon = Session.QueryOver<WorkshopItem>()
.Where(w => w.DateDue <= twoWeeksTime).List();
var matches = from wsi in workshopItemsDueSoon
where !(from todo in allTodos
select todo.TaskName)
.Contains(wsi.Name)
select wsi;
Ideally I'd like to have just one NHibernate query that returns a list of WorkshopItems that match my requirement.
I think I've managed to put together a Linq version of the answer put forward by @CSL and will mark that as the accepted answer, as it pointed me in the direction of the following.
var twoWeeksTime = DateTime.Now.AddDays(14);
var subquery = NHibernate.Criterion.QueryOver.Of<ToDo>().Select(t => t.TaskName);
var matchingItems = Session.QueryOver<WorkshopItem>()
.Where(w => w.DateDue <= twoWeeksTime &&
w.IsWorkshopItemInProgress == true)
.WithSubquery.WhereProperty(x => x.Name).NotIn(subquery)
.Future<WorkshopItem>();
It returns the results I'm expecting and doesn't rely on magic strings. I'm hesitant because I don't fully understand the WithSubquery (and whether inlining it would be a good thing). It seems to equate to
WHERE WorkshopItem.Name NOT IN (subquery)
Also I don't understand the Future instead of List. If anyone would shed some light on those that would help.
I am not 100% sure how to achieve what you need using LINQ, so to give you an option I am putting up an alternative solution using NHibernate Criteria (this will execute in one database hit):
// Create a query
ICriteria query = Session.CreateCriteria<WorkshopItem>("wsi");
// Restrict to items due within the next 14 days
query.Add(Restrictions.Le("DateDue", DateTime.Now.AddDays(14)));
// Return all TaskNames from ToDos
DetachedCriteria allTodos = DetachedCriteria.For(typeof(ToDo)).SetProjection(Projections.Property("TaskName"));
// Filter workshop items for any that do not have a to-do item
query.Add(Subqueries.PropertyNotIn("Name", allTodos));
// Return results
var matchingItems = query.Future<WorkshopItem>().ToList();
I'd recommend
var workshopItemsDueSoon = Session.QueryOver<WorkshopItem>()
.Where(w => w.DateDue <= twoWeeksTime);
var allTodos = Session.QueryOver<ToDo>();
Instead of
var allTodos = Session.QueryOver<ToDo>().List();
var workshopItemsDueSoon = Session.QueryOver<WorkshopItem>()
.Where(w => w.DateDue <= twoWeeksTime).List();
So that the collection isn't iterated until you need it to be.
I've found that it's helpful to use LINQ extension methods to make subqueries more readable and less awkward.
For example:
var matches = from wsi in workshopItemsDueSoon
where !allTodos.Select(it=>it.TaskName).Contains(wsi.Name)
select wsi;
Personally, since the query is fairly simple, I'd prefer to do it like so:
var matches = workshopItemsDueSoon.Where(wsi => !allTodos.Select(it => it.TaskName).Contains(wsi.Name));
The latter seems less verbose to me.

How to select distinct values from DB separated by comma?

I have a front end with 2 columns, Keywords1 and Keywords2. In the database they go into a single field called Keywords (separated by commas). I also have a search screen with a Keywords auto-complete text box, and in order to populate it I need to get the individual values from the DB. The data looks something like this:
Keywords
A
A
A,B
B,C
C,E
D,K
Now in order to populate them as a single listItem I need something like.
Keywords
A
B
C
D
k
That way the front end doesn't contain any duplicates. I am not much of an expert in SQL. One way I know is to get the distinct values from the DB with LIKE %entered keywords%, then use LINQ to split them by comma and get the distinct values. But that would be a lengthy path.
Any suggestion would be highly appreciated.
Thanks in advance.
Maybe a bit late, but an alternative answer that ends up with distinct keywords:
List<string> yourKeywords= new List<string>(new string[] { "A,B,C", "C","B","B,C" });
var splitted = yourKeywords
.SelectMany(item => item.Split(','))
.Distinct();
This will not work directly against the DB, though. You would have to read the DB contents into memory before doing the SelectMany, since Split has no equivalent in SQL. It would then look like:
var splitted = db.Keywords
.AsEnumerable()
.SelectMany(item => item.Split(','))
.Distinct();
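If the stored values may contain stray spaces, empty entries, or mixed case (the sample output in the question lists "k" where the data has "K"), the in-memory step could be tightened as in this sketch. Trimming and case-insensitive comparison are assumptions about the data, and KeywordSplitter is a hypothetical helper name:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeywordSplitter
{
    // Splits each comma separated row and returns the distinct keywords.
    public static List<string> DistinctKeywords(IEnumerable<string> rows)
    {
        return rows
            .SelectMany(row => row.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
            .Select(keyword => keyword.Trim())
            .Where(keyword => keyword.Length > 0)
            // Assumption: "k" and "K" should count as the same keyword.
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```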
Getting them by using string split and Linq group by
List<string> yourKeywords= new List<string>(new string[] { "A,B,C", "C","B","B,C" });
List<string> splitted = new List<string>();
yourKeywords.ForEach(x => splitted.AddRange(x.Split(',')));
var t = splitted.GroupBy(x => x);
