How to select distinct values from DB separated by comma? - c#

I have a front end with two fields, Keywords1 and Keywords2; in the database they are stored in a single field called keywords (separated by commas). I also have a search screen with a Keywords autocomplete text box, and to populate it I need to get the individual values from the DB. The data looks something like this:
Keywords
A
A
A,B
B,C
C,E
D,K
To populate each one as a single ListItem, I need something like:
Keywords
A
B
C
D
E
K
That way the front end doesn't contain any duplicates. I am not much of an expert in SQL. One way I know is to get the distinct values from the DB with LIKE '%entered keywords%', then use LINQ to split them by comma and take the distinct values. But that seems like a lengthy path.
Any suggestion would be highly appreciated.
Thanks in advance.

Maybe a bit late, but an alternative answer that ends up with distinct keywords:
List<string> yourKeywords = new List<string>(new string[] { "A,B,C", "C", "B", "B,C" });
var splitted = yourKeywords
    .SelectMany(item => item.Split(','))
    .Distinct();
This will not work directly against the DB, though. You would have to read the DB contents into memory before doing the SelectMany, since Split typically cannot be translated to SQL. It would then look like:
var splitted = db.Keywords
    .AsEnumerable()
    .SelectMany(item => item.Split(','))
    .Distinct();
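Since the values feed an autocomplete box, it may also help to tidy them after the split. The following is only a sketch under assumptions: the rows list stands in for whatever the query returns, and trimming plus case-insensitive de-duplication is my guess at what the front end wants (so "k" and "K" would count as one keyword).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows as they might come back from the keywords column.
var rows = new List<string> { "A", "A", "A,B", "B,C", "C,E", "D, k" };

// Split each row, trim the pieces, and de-duplicate ignoring case.
List<string> keywords = rows
    .SelectMany(row => row.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
    .Select(k => k.Trim())
    .Where(k => k.Length > 0)
    .Distinct(StringComparer.OrdinalIgnoreCase)
    .OrderBy(k => k, StringComparer.OrdinalIgnoreCase)
    .ToList();

Console.WriteLine(string.Join(", ", keywords)); // A, B, C, D, E, k
```

Distinct keeps the first spelling it sees, so which casing survives depends on row order.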

Getting them by using string split and Linq group by
List<string> yourKeywords = new List<string>(new string[] { "A,B,C", "C", "B", "B,C" });
List<string> splitted = new List<string>();
yourKeywords.ForEach(x => splitted.AddRange(x.Split(',')));
// Each group's Key is one distinct keyword.
var t = splitted.GroupBy(x => x).Select(g => g.Key);

Related

Lambda Function to find most popular word in a List C# [duplicate]

This question already has answers here:
How to Count Duplicates in List with LINQ
(7 answers)
Closed 2 years ago.
I currently have what I believe is a lambda function in C# (I'm fairly new to coding and haven't used a lambda function before, so go easy). It groups duplicate strings from FilteredList, counts the number of occurrences of each, and stores that value in Count. I only want the most used word from the list, which I've managed to get with the groups.OrderBy(...) line, but I'm pretty sure I've made this very complicated for myself and very inefficient, especially by adding the dictionary and the key/value pairs.
var groups =
    from s in FilteredList
    group s by s into g
    // orderby g descending
    select new
    {
        Stuff = g.Key,
        Count = g.Count()
    };
groups = groups.OrderBy(g => g.Count).Reverse().Take(1);
var dictionary = groups.ToDictionary(g => g.Stuff, g => g.Count);
foreach (KeyValuePair<string, int> kvp in dictionary)
{
    Console.WriteLine("Key = {0}, Value = {1}", kvp.Key, kvp.Value);
}
Would someone please either help me through this and explain a little bit of it to me, or at least point me in the direction of some learning materials which may help me better understand it?
For extra info: The FilteredList comes from a large piece of external text, read into a List of strings (split by delimiters), minus a list of string stop words.
Also, if this is not a lambda function or I've got any of the info in here incorrect, please kindly correct me so I can fix the question to be more relevant & help me find an answer.
Thanks in advance.
Yes, I think you have overcomplicated it somewhat. Assuming your list of words is like:
var words = new[] { "what's", "the", "most", "most", "most", "mentioned", "word", "word" };
You can get the most mentioned word with:
words.GroupBy(w => w).OrderByDescending(g => g.Count()).First().Key;
Of course, you'd probably want to assign it to a variable, and presentationally you might want to break it into multiple lines:
var mostFrequentWord = words
    .GroupBy(w => w)                   // make a list of sublists of words, like a dictionary of word:list<word>
    .OrderByDescending(g => g.Count()) // order by sublist count descending
    .First()                           // take the first list:sublist
    .Key;                              // take the word
The GroupBy produces a collection of IGroupings, which is like a Dictionary<string, List<string>>. It maps each word (the key of the dictionary) to a list of all the occurrences of that word. In my example data, the IGrouping with the Key of "most" will be mapped to a List<string> of {"most","most","most"}, which has the highest count of elements at 3. If we OrderByDescending the groupings based on the Count() of each of the lists and then take the First, we'll get the IGrouping with a Key of "most", so all we need to do to retrieve the actual word is pull the Key out.
If the word is just one of the properties of a larger object, then you can .GroupBy(o => o.Word). If you want some other property from the IGrouping, such as its first or last element, then you can take that instead of the Key, but bear in mind that the property you end up taking might be different each time unless you enforce ordering of the list inside the grouping.
If you want to make this more efficient, then you can install MoreLinq and use MaxBy (.NET 6 and later also include a built-in Enumerable.MaxBy); getting the Max word By the count of the lists means you can avoid a sort operation. You could also avoid LINQ and use a dictionary:
string[] words = new[] { "what", "is", "the", "most", "most", "most", "mentioned", "word", "word" };
var maxK = "";
var maxV = -1;
var d = new Dictionary<string, int>();
foreach (var w in words)
{
    if (!d.ContainsKey(w))
        d[w] = 0;
    d[w]++;
    if (d[w] > maxV)
    {
        maxK = w;
        maxV = d[w];
    }
}
Console.WriteLine(maxK);
This keeps a dictionary that counts words as it goes, and will be more efficient than the LINQ route, as it needs only a single pass over the word list plus the associated dictionary lookups, in contrast to "convert wordlist to list of sublists, sort list of sublists by sublist count, take first list item".
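For comparison, here is a minimal sketch of the MaxBy route mentioned above. It assumes .NET 6 or later, where Enumerable.MaxBy is part of System.Linq; with MoreLinq the call would look similar, but I have not checked its exact return type.

```csharp
using System;
using System.Linq;

var words = new[] { "what", "is", "the", "most", "most", "most", "mentioned", "word", "word" };

// MaxBy scans the groups once and keeps the one with the largest count,
// so no sort is needed. For an empty input it returns null here,
// because IGrouping is a reference type.
var top = words
    .GroupBy(w => w)
    .MaxBy(g => g.Count());

Console.WriteLine(top?.Key); // most
```

The grouping step still builds all the sublists, so the dictionary loop remains the leaner option when memory matters.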
This should work:
var mostPopular = groups
    .OrderByDescending(x => x.Count)
    .FirstOrDefault();
OrderByDescending along with .First() combines your usage of OrderBy, Reverse() and Take.
First part is a Linq operation to read the groups from the FilteredList.
var groups =
    from s in FilteredList
    group s by s into g
    // orderby g descending
    select new
    {
        Stuff = g.Key,
        Count = g.Count()
    };
Lambda usage starts wherever the => operator appears. The parameter goes on the left and the expression to evaluate goes on the right; the result is an anonymous function that is called at run time for each element.
Example on your code:
groups = groups.OrderBy(g => g.Count).Reverse().Take(1);
Reading this, 'g' represents each element of 'groups', and each element has a 'Count' property that is used as the sort key. Because the result is a sequence, 'Reverse' can be applied to it, and 'Take(1)' keeps only the first element.
As for documentation, it is best to search Stack Overflow; these links are a good start:
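To make the distinction concrete, here is a small sketch that writes the same query twice: once with lambdas passed to extension methods, and once in query syntax. The filteredList contents are made up for illustration, and the property names Stuff and Count are simply borrowed from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data standing in for FilteredList.
var filteredList = new List<string> { "b", "a", "b", "c", "b", "a" };

// A lambda stored in a variable: takes a string and returns it as the group key.
Func<string, string> keySelector = s => s;

// Method syntax, passing lambdas directly.
var viaLambdas = filteredList
    .GroupBy(keySelector)
    .Select(g => new { Stuff = g.Key, Count = g.Count() })
    .OrderByDescending(x => x.Count)
    .First();

// Query syntax; the compiler turns this into equivalent method calls.
var viaQuery = (from s in filteredList
                group s by s into g
                orderby g.Count() descending
                select new { Stuff = g.Key, Count = g.Count() }).First();

Console.WriteLine($"{viaLambdas.Stuff}: {viaLambdas.Count}"); // b: 3
Console.WriteLine($"{viaQuery.Stuff}: {viaQuery.Count}");     // b: 3
```

So the code in the question is a query expression; the lambdas only appear in the OrderBy and ToDictionary calls that follow it.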
C# Lambda expressions: Why should I use them? - StackOverflow
Lambda Expressions in C# - external
Using a Lambda Expression Over a List in C# - external
Second step: if the data comes from an external source and there are no performance issues, you can leave the code as it is and refactor later. A more detailed analysis of the data would be needed before deciding whether another algorithm works better.

LINQ to select values from a db

I was wondering whether it is possible to use LINQ to select values from a list with some logic. For example, if I had the postcodes EC1V 2DD, EC1M 51D, ..., how would I use var list.Select(rs => rs.Postcodes.Select()).ToList() to create a list of postcodes with only the characters before the space? Thanks!
I made a list of postcodes for testing purposes.
I split on the space and select the first element. Of course, you need to validate the case where there isn't any space, or something like that. Hope this helps.
List<string> postcodes = new List<string>();
postcodes.Add("EC1V 2DD");
postcodes.Add("EC1M 51D");
// Index 0 is the part before the space.
var query = postcodes.Select(x => x.Split(' ')[0]).ToList();
foreach (var item in query)
{
    Console.WriteLine(item);
}
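The validation mentioned above could look roughly like this. It is a sketch under assumptions: the extra sample entries are invented, and keeping a postcode whole when it has no space (rather than rejecting it) is just one possible policy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical input, including entries with padding, no space, and no content.
var postcodes = new List<string> { "EC1V 2DD", "EC1M 51D", "  N1 9GU", "SW1A", "" };

// Keep the part before the first space.
// Entries with no space are kept whole; blank entries are skipped.
List<string> prefixes = postcodes
    .Where(p => !string.IsNullOrWhiteSpace(p))
    .Select(p => p.Trim())
    .Select(p =>
    {
        int space = p.IndexOf(' ');
        return space < 0 ? p : p.Substring(0, space);
    })
    .ToList();

Console.WriteLine(string.Join(", ", prefixes)); // EC1V, EC1M, N1, SW1A
```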

Get Distinct String List From DataTable

To reduce the amount of code, is it possible to combine this in one line - where I convert a DataTable column into a string list, but I only want the distinct items in that list (there are multiple columns, so sometimes columns will have multiple values, where one won't):
List<string> column1List = returnDataTable.AsEnumerable().Select(x => x["Column1"].ToString()).ToList();
var distinctColumn1 = (from distinct1 in column1List select distinct1).Distinct();
The above works, but is an extra line. Since the distinct is an option on the list, I did try:
List<string> column1List = (returnDataTable.AsEnumerable().Select(x => x["Column1"].ToString()).ToList()).Distinct();
However, that errors, so it appears that distinct can't be called on a list being converted from a DataTable (?).
Just curious if it's possible to convert a DataTable into a string list and only get the distinct values in one line. May not be possible.
Distinct returns IEnumerable<TSource>; in your case it returns IEnumerable<string>, while you are trying to assign the result to a List<string>.
You need to change the code from
List<string> column1List = (returnDataTable.AsEnumerable().Select(x => x["Column1"].ToString()).ToList()).Distinct();
to
List<string> column1List = returnDataTable.AsEnumerable().Select(x => x["Column1"].ToString()).Distinct().ToList();
Using System.Linq, you can use something like this:
my_enumerable.GroupBy(x => x.Column1).Select(x => x.First()).ToList()
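Here is a self-contained sketch of the one-line version, in case it is useful to try it in isolation. The table contents are invented, and using Field<string> with a null filter (instead of ToString()) is my own variation so that DBNull cells are skipped rather than turned into empty strings.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical table standing in for returnDataTable.
var returnDataTable = new DataTable();
returnDataTable.Columns.Add("Column1", typeof(string));
returnDataTable.Rows.Add("x");
returnDataTable.Rows.Add("y");
returnDataTable.Rows.Add("x");
returnDataTable.Rows.Add(DBNull.Value);

// One statement: project the column, drop nulls, de-duplicate, materialize.
var distinctColumn1 = returnDataTable.AsEnumerable()
    .Select(row => row.Field<string>("Column1")) // DBNull comes back as null
    .Where(value => value != null)
    .Distinct()
    .ToList();

Console.WriteLine(string.Join(", ", distinctColumn1)); // x, y
```

On .NET Framework, AsEnumerable and Field need a reference to System.Data.DataSetExtensions.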

How can i split and get distinct words in a list?

My sample data column, which comes from a CSV file, is
|----Category------------|
SHOES
SHOES~SHOCKS
SHOES~SHOCKS~ULTRA SOCKS
I would love to split the specific column and get the distinct values in a list like
SHOES
SHOCKS
ULTRA SOCKS
I tried the following, but it does not work as expected.
var test = from c in products select c.Category.Split('~').Distinct().ToList();
It actually returns a separate list for each row rather than one combined list.
Any thoughts please? Thank you.
I would use SelectMany to "flatten" the list before removing duplicates:
products.SelectMany(c => c.Category.Split('~'))
        .Distinct()
You can use SelectMany to flatten the collection:
products.SelectMany(p => p.Category.Split('~')).Distinct().ToList();
You were close, you just needed to flatten out your collection to pull the individual items of each grouping via a SelectMany() call :
// The SelectMany will map the results of each of your Split() calls
// into a single collection (instead of multiple)
var test = products.SelectMany(p => p.Category.Split('~'))
                   .Distinct()
                   .ToList();
You can see a complete working example below:
// Example input
var input = new string[] { "SHOES", "SHOES~SHOCKS", "SHOES~SHOCKS~ULTRA SOCKS" };
// Get your results (yields ["SHOES","SHOCKS","ULTRA SOCKS"])
var output = input.SelectMany(p => p.Split('~'))
                  .Distinct()
                  .ToList();
Merge this list of lists of strings into a single list by using SelectMany(), and just add another Distinct to your list:
var test = (from c in products select c.Category.Split('~').Distinct().ToList()).SelectMany(x => x).Distinct().ToList();
Here's how you'd do it in query syntax.
var test = (from p in products
            from item in p.Category.Split('~')
            select item).Distinct().ToList();
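To see why the original attempt did not give one flat list, it may help to compare Select and SelectMany side by side. This is only an illustrative sketch; it uses a plain string array instead of the products collection, whose type is not shown in the question.

```csharp
using System;
using System.Linq;

var categories = new[] { "SHOES", "SHOES~SHOCKS", "SHOES~SHOCKS~ULTRA SOCKS" };

// Select keeps one inner list per row, so Distinct only works within each row.
var nested = categories
    .Select(c => c.Split('~').Distinct().ToList())
    .ToList();

// SelectMany concatenates the inner arrays first, so Distinct sees every value.
var flat = categories
    .SelectMany(c => c.Split('~'))
    .Distinct()
    .ToList();

Console.WriteLine(nested.Count);               // 3 lists, one per row
Console.WriteLine(nested.Sum(l => l.Count));   // 6 values in total, duplicates included
Console.WriteLine(string.Join(", ", flat));    // SHOES, SHOCKS, ULTRA SOCKS
```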

How way to operate on single items at an index in group in IGrouping?

I have simplified GroupBy code below, where I know there exist at most two records for each group, with each group being keyed by the value at index two in a string array. I want to iterate through the list of keys in the IGrouping, combine some values in each group, and then add that result to a final list, but I am new to LINQ, so I don't know exactly how to access the first and/or second values at an index.
Can anyone shed some light on how to do this?
Each group derived from var lines is something like this:
key   string[]
----  ---------------
123   A, stuff, stuff
123   B, stuff, stuff
and I want the result to be a string[] that combines elements of each group in the "final" list like:
string[]
-------
A, B
my code:
var lines = File.ReadAllLines(#path).Skip(1).Select(r => r.Split('\t')).ToList();
List<string[]> final = new List<string[]>();
var groups = lines.GroupBy(r => r[2]);
foreach (var pairs in groups)
{
    // I need to combine items in each group here; maybe a for() statement would be better so I can peek ahead??
    foreach (string[] item in pairs)
    {
        string[] s = new string[] { };
        s[0] = item[0];
        s[1] = second item in group - not sure what to do here or if I am going about this the right way
        final.Add(s);
    }
}
There's not too much support on the subject either, so I figured it may be helpful to somebody.
It sounds like all you're missing is calling ToList or ToArray on the group:
foreach (var group in groups)
{
    List<string[]> pairs = group.ToList();
    // Now you can access pairs[0] for the first item in the group,
    // pairs[1] for the second item, pairs.Count to check how many
    // items there are, or whatever.
}
Or you could avoid creating a list, and call Count(), ElementAt(), ElementAtOrDefault() etc on the group.
Now depending on what you're actually doing in the body of your nested foreach loop (it's not clear, and the code you've given so far won't work because you're trying to assign a value into an empty array) you may be able to get away with:
var final = lines.GroupBy(r => r[2])
                 .Select(g => ...)
                 .ToList();
where the ... is "whatever you want to do with a group". If you can possibly do that, it would make the code a lot clearer.
With more information in the question, it looks like you want just:
var final = lines.GroupBy(r => r[2])
                 .Select(g => g.Select(array => array[0]).ToArray())
                 .ToList();
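A runnable version of that last snippet, with made-up data, might look like the following. The rows are hypothetical: I have assumed tab-separated fields with the group key in the third position (index 2), which matches the GroupBy call but not necessarily the real file layout.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical tab-separated rows; the third field (index 2) is the group key.
var rawLines = new[]
{
    "A\tstuff\t123",
    "B\tstuff\t123",
    "C\tstuff\t456",
};

List<string[]> lines = rawLines.Select(r => r.Split('\t')).ToList();

// One string[] per group, holding the first field of each member.
List<string[]> final = lines
    .GroupBy(r => r[2])
    .Select(g => g.Select(fields => fields[0]).ToArray())
    .ToList();

foreach (string[] combined in final)
{
    Console.WriteLine(string.Join(", ", combined)); // "A, B" then "C"
}
```

A group with only one record simply yields a one-element array, so no peeking ahead is needed.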
