How to do this in LINQ (C#)

So far, I have this:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
    .GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)));
The configuration folder will contain pairs of files:
abc.json
abc-input.json
def.json
def-input.json
The GetReportName() method strips off the "-input" suffix and title-cases the file name, so you end up with this grouping:
Abc
    abc.json
    abc-input.json
Def
    def.json
    def-input.json
I have a ReportItem class with a constructor that takes (Name, str1, str2). I want to extend the LINQ query to create the ReportItems in a single statement, something like:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
    .GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
    .Select(x => new ReportItem(x.Key, x[0], x[1]));
Obviously the last line doesn't work, because a grouping doesn't support indexing like that. Each item should be constructed as "Abc", "abc.json", "abc-input.json", and so on.

If you know that each group of interest contains exactly two items, use First() to get the item at index 0, and Last() to get the item at index 1:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
    .GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
    .Where(g => g.Count() == 2) // Make sure we have exactly two items
    .Select(x => new ReportItem(x.Key, x.First(), x.Last()));

var v = Directory.EnumerateFiles(_strConfigurationFolder)
    .GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
    .Select(x => new ReportItem(x.Key, x.FirstOrDefault(), x.Skip(1).FirstOrDefault()));
But are you sure there will be exactly two items in each group? Would it make sense for ReportItem to accept an IEnumerable<string> instead of just two strings?
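Both answers above rely on the order in which the files come back, and Directory.EnumerateFiles does not promise one (in ordinal order, "abc-input.json" actually sorts before "abc.json", because '-' precedes '.'). The following is only a sketch of one way to avoid that dependency by picking each file by its suffix. The GetReportName and IsInput helpers are hypothetical stand-ins, a tuple replaces the ReportItem class, and an in-memory array replaces the directory listing.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical stand-in for the asker's GetReportName(): strips "-input" and title-cases.
static string GetReportName(string name)
{
    const string suffix = "-input";
    if (name.EndsWith(suffix, StringComparison.Ordinal))
        name = name.Substring(0, name.Length - suffix.Length);
    return CultureInfo.InvariantCulture.TextInfo.ToTitleCase(name);
}

// Hypothetical helper: true for the "-input" half of a pair.
static bool IsInput(string file) =>
    Path.GetFileNameWithoutExtension(file).EndsWith("-input", StringComparison.Ordinal);

// In-memory names stand in for Directory.EnumerateFiles(_strConfigurationFolder).
var files = new[] { "abc-input.json", "abc.json", "def.json", "def-input.json" };

var reports = files
    .GroupBy(f => GetReportName(Path.GetFileNameWithoutExtension(f)))
    // Keep only complete pairs: one main file and one "-input" file.
    .Where(g => g.Count() == 2 && g.Count(IsInput) == 1)
    // A tuple stands in for new ReportItem(name, str1, str2).
    .Select(g => (Name: g.Key, Main: g.First(f => !IsInput(f)), Input: g.First(IsInput)))
    .ToList();

foreach (var r in reports)
    Console.WriteLine($"{r.Name}: {r.Main}, {r.Input}");
```

With the real class, the Select would presumably become new ReportItem(g.Key, g.First(f => !IsInput(f)), g.First(IsInput)).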

Related

How to modify string list for duplicate values?

I am working on an ASP.NET Core MVC project. I want to merge the duplicate entries of a string list into a single entry with comma-separated values.
I have a string list:
List<string> stringList = surveylist.Split('&').ToList();
This generates the following output:
7=55
6=33
5=MCC
4=GHI
3=ABC
1003=DEF
1003=ABC
1=JKL
I want to change the output to this:
7=55
6=33
5=MCC
4=GHI
3=ABC
1003=DEF,ABC
1=JKL
The values of duplicate keys should be comma-separated.
There are probably 20 ways to do this. One simple one would be:
List<string> newStringList = stringList
    .Select(a => new { KeyValue = a.Split("=") })
    .GroupBy(a => a.KeyValue[0])
    .Select(a => $"{a.Key}={string.Join(",", a.Select(x => x.KeyValue[1]))}")
    .ToList();
Take a look at your output. Notice that an equal sign separates each string into a key-value pair. Think about how you want to approach this problem. Is a list of strings really the structure you want to build on? You could take a different approach and use a list of KeyValuePairs or a Dictionary instead.
If you really need to do it with a List, then look at the methods that LINQ's Enumerable class has to offer, namely Select and GroupBy.
You can use Select to split once more on the equal sign: .Select(s => s.Split('=')).
You can use GroupBy to group values by a key: .GroupBy(pair => pair[0]).
To join it back to a string, you can use a Select again.
An end result could look something like this:
List<string> stringList = surveylist.Split('&')
    .Select(s => {
        string[] pair = s.Split('=');
        return new { Key = pair[0], Value = pair[1] };
    })
    .GroupBy(pair => pair.Key)
    .Select(g => string.Concat(
        g.Key,
        "=",
        string.Join(
            ",", // no space, to match the requested output "1003=DEF,ABC"
            g.Select(pair => pair.Value)
        )
    ))
    .ToList();
The group contains pairs so you need to select the value of each pair and join them into a string.
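As a worked check of the approach described in these answers, here is a minimal, self-contained sketch run against the sample values from the question. The input string is reconstructed from the listed output, and the Split('=', 2) overload assumes .NET Core 2.0 or later; entries without an '=' would throw here, so real input may need validation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample input rebuilt from the values shown in the question.
string surveylist = "7=55&6=33&5=MCC&4=GHI&3=ABC&1003=DEF&1003=ABC&1=JKL";

List<string> merged = surveylist
    .Split('&')
    .Select(s => s.Split('=', 2))      // split on the first '=' only
    .GroupBy(p => p[0], p => p[1])     // key -> its values, in input order
    .Select(g => $"{g.Key}={string.Join(",", g)}")
    .ToList();

Console.WriteLine(string.Join(Environment.NewLine, merged));
```

GroupBy in LINQ to Objects yields groups in the order their keys first appear, which is why the merged list keeps the original ordering.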

LINQ returns List<{int,double}> with two values after .Select, but I need a List<int> with one value only

I have this assignment. I need to create a method that works with JSON data and answers this question:
On input N, what are the top N movies? The score of a movie is its average grade.
I have a JSON file with 5 million entries. Each row looks like this:
{ Reviewer:1, Movie:1535440, Grade:1, Date:'2005-08-18'},
{ Reviewer:1, Movie:1666666, Grade:2, Date:'2006-09-20'},
{ Reviewer:2, Movie:1535440, Grade:3, Date:'2008-05-10'},
{ Reviewer:3, Movie:1535440, Grade:5, Date:'2008-05-11'},
This file is deserialized and then stored as an IEnumerable. I then wanted to create a method that returns a List<int>, where each int is a movie ID. The movies in the list are ordered by score, descending, and the number of top movies is specified as a parameter of the method.
My method looks like this:
public List<int> GetSpecificAmountOfBestMovies(int amountOfMovies)
{
    var moviesAndAverageGradeSortedList = _deserializator.RatingCollection()
        .GroupBy(movieId => movieId.Movie)
        .Select(group => new
        {
            Key = group.Key,
            Average = group.Average(g => g.Grade)
        })
        .OrderByDescending(a => a.Average)
        .Take(amountOfMovies)
        .ToList();
    var moviesSortedList = new List<int>();
    foreach (var movie in moviesAndAverageGradeSortedList)
    {
        var key = movie.Key;
        moviesSortedList.Add(key);
    }
    return moviesSortedList;
}
So moviesAndAverageGradeSortedList is a List<{int, double}> because of the .Select call. I cannot return it directly, because the method's return type is List<int>: I want only the movie IDs, not their average grades.
So I created a new List<int> and a foreach loop that goes through moviesAndAverageGradeSortedList and saves only the keys from that list.
I don't think this solution is right, because the foreach loop could be very slow when I pass a big number as the parameter. Does anybody know how I can get the keys (movie IDs) from the first list, and so avoid creating another List<int> and the foreach loop?
I will be thankful for any solution.
You can avoid creating the second list by just adding another .Select after the ordering. To make it all a bit cleaner, you could also write:
return _deserializator.RatingCollection()
.GroupBy(i => i.Movie)
.OrderByDescending(g => g.Average(i => i.Grade))
.Select(g => g.Key)
.Take(amountOfMovies)
.ToList();
Note that this won't really improve performance much (if at all), because even in your original implementation the second list is created only from the first n items. The expensive operation is the ordering by group average, which has to be performed on all items in the JSON file regardless of the number of items you want to return.
You could add another Select after you have ordered the list by average:
var moviesAndAverageGradeSortedList = _deserializator.RatingCollection()
.GroupBy(movieId => movieId.Movie)
.Select(group => new
{
Key = group.Key,
Average = group.Average(g => g.Grade)
})
.OrderByDescending(a => a.Average)
.Take(amountOfMovies)
.Select(s=> s.Key)
.ToList();
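To see the projection at work, the sketch below runs the same query shape over the four sample rows from the question. It is an illustration under assumptions: tuples replace the deserialized rating type, the Date field is left out, and TopMovies is a hypothetical local function standing in for GetSpecificAmountOfBestMovies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The four sample rows from the question, as tuples instead of the deserialized type.
var ratings = new[]
{
    (Reviewer: 1, Movie: 1535440, Grade: 1),
    (Reviewer: 1, Movie: 1666666, Grade: 2),
    (Reviewer: 2, Movie: 1535440, Grade: 3),
    (Reviewer: 3, Movie: 1535440, Grade: 5),
};

// Hypothetical stand-in for GetSpecificAmountOfBestMovies.
List<int> TopMovies(int amountOfMovies) =>
    ratings
        .GroupBy(r => r.Movie)
        .OrderByDescending(g => g.Average(r => r.Grade)) // average grade per movie
        .Take(amountOfMovies)
        .Select(g => g.Key)                               // keep only the movie ID
        .ToList();

Console.WriteLine(string.Join(", ", TopMovies(2)));
```

Movie 1535440 averages (1 + 3 + 5) / 3 = 3 and movie 1666666 averages 2, so 1535440 ranks first.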

Remove every first element of grouped collection

I have a collection of elements and some of these elements are duplicating. I need to extract all records but only the first record if the record is one of a duplicate set.
I was able to group the elements and find all elements that have duplicates, but how to remove every first element of a group?
var records =
    dbContext.Competitors
        .GroupBy(x => x.Email)
        .Select(x => new { Properties = x,
                           Count = x.Count() }) // count the group's records, not the characters of the key
        .Where(x => x.Count > 1)
        .ToList();
EDIT: It seems to be impossible to accomplish this task with EF, because it fails to translate the desired LINQ expression to SQL. I'll be happy if someone offers a different approach.
To exclude the first record from each email-address group with more than one entry, you could do this:
var records = dbContext.Competitors
.GroupBy(x => x.Email)
.SelectMany(x => (x.Count() == 1) ? x : x.OrderBy(t=>t).Skip(1))
.ToList();
This is the logic:
Group by a property > Select every group > (Possibly) sort it > Skip the first one
This can be turned into LINQ code like this:
// use SelectMany to flatten the groups
var x = list.GroupBy(g => g.Key).Select(grp => grp.Skip(1)).SelectMany(i => i);
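The two answers above behave differently for e-mail addresses that occur only once, which the following in-memory sketch tries to make visible. The data is made up, and the Id field is an assumption used to define which record counts as "first"; with EF, the rows would first have to be loaded into memory (for example with AsEnumerable()), since the question's edit reports that the query does not translate to SQL.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data; Id is an assumed column that defines "first".
var competitors = new[]
{
    (Id: 1, Email: "a@example.com"),
    (Id: 2, Email: "b@example.com"),
    (Id: 3, Email: "a@example.com"),
    (Id: 4, Email: "a@example.com"),
    (Id: 5, Email: "c@example.com"),
};

// First answer: keep single records, drop the first record of each duplicate set.
var keepSingles = competitors
    .GroupBy(c => c.Email)
    .SelectMany(g => g.Count() == 1 ? g.AsEnumerable() : g.OrderBy(c => c.Id).Skip(1))
    .Select(c => c.Id)
    .ToList();

// Second answer: Skip(1) on every group, so single records disappear as well.
var dropFirstOfAll = competitors
    .GroupBy(c => c.Email)
    .SelectMany(g => g.OrderBy(c => c.Id).Skip(1))
    .Select(c => c.Id)
    .ToList();

Console.WriteLine(string.Join(", ", keepSingles));    // 3, 4, 2, 5
Console.WriteLine(string.Join(", ", dropFirstOfAll)); // 3, 4
```

Which of the two is wanted depends on how the requirement in the question is read.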

Modifying an IEnumerable type

I have an IEnumerable of strings that I get from the code below. The variable groups is an IEnumerable that holds some string values. Say there are 4 values in groups and the value in the second position is just an empty string "". How can I move it to the 4th (that is, the last) position? I do not want to sort or otherwise change the order, just move the empty "" value, wherever it occurs, to the last position.
List<Item> Items = somefunction();
var groups = Items.Select(g => g.Category).Distinct();
Simply order the results by their string value:
List<Item> Items = somefunction();
var groups = Items.Select(g => g.Category).Distinct().OrderByDescending(s => s);
Edit (following OP edit):
List<Item> Items = somefunction();
var groups = Items.Select(g => g.Category).Distinct();
groups = groups.Where(s => !String.IsNullOrEmpty(s))
.Concat(groups.Where(s => String.IsNullOrEmpty(s)));
You can't directly modify the IEnumerable<> instance, but you can create a new one:
var list = groups.Where(x => x != "").Concat(groups.Where(x => x == ""));
Note that in this query, groups is iterated twice. This is usually not a good practice for a deferred IEnumerable<>, so you should call ToList() after the Distinct() to eagerly evaluate your LINQ query:
var groups = Items.Select(g => g.Category).Distinct().ToList();
EDIT :
On second thought, there's a much easier way to do this:
var groups = Items.Select(g => g.Category).Distinct().OrderBy(x => x == "");
Note that this doesn't touch the order of the non-empty elements since OrderBy is stable.
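A small sketch of why this works: the key x == "" is a bool, false sorts before true, and the stable sort of LINQ to Objects keeps equal keys in their original order. The sample categories below are made up for illustration.

```csharp
using System;
using System.Linq;

// Made-up category values with an empty string in the second position.
var categories = new[] { "b", "", "a", "c" };

var groups = categories
    .Distinct()
    .OrderBy(x => x == "")   // false (non-empty) first, true (empty) last
    .ToList();

Console.WriteLine(string.Join("|", groups));
```

The non-empty values keep their relative order ("b", "a", "c") and the empty string ends up last.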
var groups = Items.Select(g => g.Category).Distinct().OrderByDescending(s =>s);
I don't like my query but it should do the job. It selects all items which are not empty and unions it with the items which are empty.
var groups = Items.Select(g => g.Category).Distinct()
.Where(s => !string.IsNullOrEmpty(s))
.Union(Items.Select(g => g.Category).Distinct()
.Where(s => string.IsNullOrEmpty(s)));
Try something like:
var temp = groups.Where(item => !String.IsNullOrEmpty(item)).ToList();
while (temp.Count < groups.Count()) temp.Add("");

Counting grouped data with Linq to Sql

I have a database of documents in an array, each with an owner and a document type, and I'm trying to get a list of the 5 most common document types for a specific user.
var docTypes = _documentRepository.GetAll()
.Where(x => x.Owner.Id == LoggedInUser.Id)
.GroupBy(x => x.DocumentType.Id);
This returns all the documents belonging to a specific owner, grouped as I need them. I now need a way to extract the IDs of the most common document types. I'm not too familiar with LINQ to SQL, so any help would be great.
This would order the groups by count, descending, and then take the top 5 of them. You could adapt it to another number, or take out the Take() completely if it's not needed in your case:
var mostCommon = docTypes.OrderByDescending( x => x.Count()).Take(5);
To just select the top document keys:
var mostCommonDocTypes = docTypes.OrderByDescending( x => x.Count())
.Select( x=> x.Key)
.Take(5);
You can also of course combine this with your original query by appending/chaining it, just separated for clarity in this answer.
Using the Select you can get the value from the Key of the Grouping (the Id) and then a count of each item in the grouping.
var docTypes = _documentRepository.GetAll()
.Where(x => x.Owner.Id == LoggedInUser.Id)
.GroupBy(x => x.DocumentType.Id)
.Select(groupingById=>
new
{
Id = groupingById.Key,
Count = groupingById.Count(),
})
.OrderByDescending(x => x.Count);
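Put together, the chained query might look like the sketch below, run here against in-memory tuples rather than the repository. The data, the loggedInUserId variable and the ThenBy tie-break are assumptions added for illustration, and Take(2) replaces Take(5) only to keep the sample small.

```csharp
using System;
using System.Linq;

// Made-up documents as (owner, document type) pairs; stands in for _documentRepository.GetAll().
var documents = new[]
{
    (OwnerId: 1, DocumentTypeId: 10),
    (OwnerId: 1, DocumentTypeId: 20),
    (OwnerId: 1, DocumentTypeId: 20),
    (OwnerId: 1, DocumentTypeId: 30),
    (OwnerId: 1, DocumentTypeId: 20),
    (OwnerId: 1, DocumentTypeId: 10),
    (OwnerId: 2, DocumentTypeId: 30),
    (OwnerId: 2, DocumentTypeId: 30),
};

int loggedInUserId = 1; // hypothetical stand-in for LoggedInUser.Id

var mostCommonDocTypes = documents
    .Where(d => d.OwnerId == loggedInUserId)
    .GroupBy(d => d.DocumentTypeId)
    .OrderByDescending(g => g.Count())
    .ThenBy(g => g.Key)      // assumed tie-break so that equal counts come back in a fixed order
    .Select(g => g.Key)
    .Take(2)
    .ToList();

Console.WriteLine(string.Join(", ", mostCommonDocTypes));
```

For owner 1, type 20 occurs three times and type 10 twice, so those two IDs are returned in that order.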
