Hashtable Duplicates - c#

I have a Hashtable that I populate from a dictionary on the same line:
Hashtable ids = new Hashtable();
ids = new Hashtable(_AppContext.TBL_PERSON.Where(oItem => oItem.DELETED == false).ToDictionary(o => o.CODE.ToUpper(), o => o.PERSON_ID));
The thing is, I am getting an error:
"An item with the same key has already been added."
After checking the rows, it turns out that the CODE column has the same value in multiple rows.
Is there a way to select only the first value that occurs, like First(), without first loading everything into a DataTable and then converting it to a Hashtable?

Sure - use GroupBy and then pick the First() object in the group for the value:
ids = new Hashtable(
_AppContext.TBL_PERSON.Where(oItem => oItem.DELETED == false)
.GroupBy(o => o.CODE.ToUpper())
.ToDictionary(g => g.Key, g => g.First().PERSON_ID)
);
Keep in mind that wrapping the result in a Hashtable copies the dictionary's keys and values into an untyped collection, which seems odd. If you just want the dictionary, you can still use GroupBy and simply remove the outer Hashtable creation.
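As a rough in-memory illustration of the GroupBy/First() approach, here is a minimal sketch. The Person class and the sample rows are hypothetical stand-ins for TBL_PERSON, and the query runs over a plain array rather than a database context, so treat it as a sketch of the idea rather than the exact data-access code.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class FirstPerKeyDemo
{
    // Hypothetical stand-in for a TBL_PERSON row.
    public sealed class Person
    {
        public string Code { get; }
        public int PersonId { get; }
        public bool Deleted { get; }

        public Person(string code, int personId, bool deleted)
        {
            Code = code;
            PersonId = personId;
            Deleted = deleted;
        }
    }

    // Keeps the first PERSON_ID seen for each upper-cased code.
    public static Dictionary<string, int> FirstIdPerCode(IEnumerable<Person> people)
    {
        return people.Where(p => !p.Deleted)
                     .GroupBy(p => p.Code.ToUpper())
                     .ToDictionary(g => g.Key, g => g.First().PersonId);
    }

    public static void Main()
    {
        var people = new[]
        {
            new Person("abc", 1, false),
            new Person("ABC", 2, false), // duplicate code once upper-cased
            new Person("xyz", 3, false),
            new Person("def", 4, true),  // deleted, filtered out
        };

        Dictionary<string, int> byCode = FirstIdPerCode(people);
        Console.WriteLine(byCode.Count);  // 2
        Console.WriteLine(byCode["ABC"]); // 1

        // Optional: copy into a Hashtable if the untyped collection is really needed.
        var ids = new Hashtable(byCode);
        Console.WriteLine(ids["XYZ"]);    // 3
    }
}
```

With LINQ to Objects, GroupBy yields the elements of each group in source order, so First() picks the first occurrence. Against a database provider the "first" row depends on the query's ordering.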

Related

LINQ returns List<{int,double}> two values after .selected but I need List<int> with one value only

I have got this assignment. I need to create a method that works with JSON data in this form:
On input N, what are the top N movies? The score of a movie is its average grade.
So I have a JSON file with 5 million rows. Each row looks like this:
{ Reviewer:1, Movie:1535440, Grade:1, Date:'2005-08-18'},
{ Reviewer:1, Movie:1666666, Grade:2, Date:'2006-09-20'},
{ Reviewer:2, Movie:1535440, Grade:3, Date:'2008-05-10'},
{ Reviewer:3, Movie:1535440, Grade:5, Date:'2008-05-11'},
This file is deserialized and then stored as an IEnumerable. I then wanted to create a method that returns a List<int>, where each int is a movie ID. The movies in the list are ordered by score, descending, and the number of "top" movies is specified as a parameter of the method.
My method looks like this:
public List<int> GetSpecificAmountOfBestMovies(int amountOfMovies)
{
var moviesAndAverageGradeSortedList = _deserializator.RatingCollection()
.GroupBy(movieId => movieId.Movie)
.Select(group => new
{
Key = group.Key,
Average = group.Average(g => g.Grade)
})
.OrderByDescending(a => a.Average)
.Take(amountOfMovies)
.ToList();
var moviesSortedList = new List<int>();
foreach (var movie in moviesAndAverageGradeSortedList)
{
var key = movie.Key;
moviesSortedList.Add(key);
}
return moviesSortedList;
}
So moviesAndAverageGradeSortedList is a List<{int,double}> because of the .Select call. I cannot return it directly, because the method's return type is List<int>: I want only the movie IDs, not their average grades.
So I created a new List<int> and a foreach loop that goes through moviesAndAverageGradeSortedList and saves only the keys from that list.
I don't think this solution is right, because the foreach loop could be very slow when I pass a big number as the parameter. Does somebody know how I can get the keys (movie IDs) from the first list and thereby avoid creating another List<int> and the foreach loop?
I will be thankful for every solution.
You can avoid creating the second list by adding another .Select after the ordering. To make it all a bit cleaner, you could write:
return _deserializator.RatingCollection()
.GroupBy(i => i.Movie)
.OrderByDescending(g => g.Average(i => i.Grade))
.Select(g => g.Key)
.Take(amountOfMovies)
.ToList();
Note that this won't really improve performance much (if at all), because even in your original implementation the second list is created only from the first n items. The expensive operation is ordering by the group averages, which has to run over all items in the JSON file regardless of the number of items you want to return.
You could add another Select after you have ordered the list by average:
var moviesAndAverageGradeSortedList = _deserializator.RatingCollection()
.GroupBy(movieId => movieId.Movie)
.Select(group => new
{
Key = group.Key,
Average = group.Average(g => g.Grade)
})
.OrderByDescending(a => a.Average)
.Take(amountOfMovies)
.Select(s=> s.Key)
.ToList();

Cannot add same key to dictionary more than once

Here is my code:
IEnumerable<ServiceTicket> troubletickets = db.ServiceTickets.Include(t => t.Company).Include(t => t.UserProfile);
var ticketGroups = new Dictionary<string, List<ServiceTicket>>();
ticketGroups = troubletickets
.GroupBy(o => o.DueDate).ToDictionary(
group => {
var firstOrDefault = @group.FirstOrDefault();
return firstOrDefault != null
? firstOrDefault.DueDate.HasValue
? firstOrDefault.DueDate.Value.ToShortDateString()
: ""
: "";
},
group => group.ToList()
).OrderBy(g => g.Key).ToDictionary(g => g.Key, g => g.Value);
The error that I am getting is: 'An item with the same key has already been added.' This is because the DueDate value is occasionally repeated. My question is how can I keep the key from being added if it already exists in the dictionary?
It seems that you are grouping by one value (the DueDate value), but using a different value as the dictionary key.
Can you not just use the custom code for grouping instead?
ticketGroups = troubletickets
.GroupBy(o => o.DueDate.HasValue
? o.DueDate.Value.ToShortDateString()
: "")
.ToDictionary(g => g.Key, g => g.ToList());
Note that I took out the superfluous OrderBy and second ToDictionary call. I assumed you were trying to "order" the dictionary, which won't work, as a plain dictionary is not ordered.
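To make the idea concrete, here is a small in-memory sketch of grouping directly by the formatted date. The Ticket class is a hypothetical, simplified stand-in for ServiceTicket, and the sketch swaps ToShortDateString() for an invariant "yyyy-MM-dd" format so the keys do not depend on the machine's culture. That swap is an assumption made for predictability, not part of the original code. Ordering is applied only when enumerating, since the dictionary itself does not keep an order.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class DueDateGroupingDemo
{
    // Hypothetical, simplified stand-in for ServiceTicket.
    public sealed class Ticket
    {
        public int Id { get; set; }
        public DateTime? DueDate { get; set; }
    }

    // Groups by the same string that is later used as the dictionary key,
    // so two tickets due on the same day can never produce a duplicate key.
    public static Dictionary<string, List<Ticket>> GroupByDay(IEnumerable<Ticket> tickets)
    {
        return tickets
            .GroupBy(t => t.DueDate.HasValue
                ? t.DueDate.Value.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)
                : "")
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        var tickets = new[]
        {
            new Ticket { Id = 1, DueDate = new DateTime(2024, 5, 2, 9, 0, 0) },
            new Ticket { Id = 2, DueDate = new DateTime(2024, 5, 2, 17, 30, 0) }, // same day, different time
            new Ticket { Id = 3, DueDate = null },
            new Ticket { Id = 4, DueDate = new DateTime(2024, 5, 1) },
        };

        var groups = GroupByDay(tickets);

        // Sort at enumeration time instead of trying to "order" the dictionary.
        foreach (var pair in groups.OrderBy(p => p.Key, StringComparer.Ordinal))
        {
            Console.WriteLine("'" + pair.Key + "': " + pair.Value.Count + " ticket(s)");
        }
    }
}
```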
You get duplicate keys because there are two ways to get an empty string as key, either an empty group, or an empty date. The duplicate will always be the empty string. I wonder if you really intended to get an empty string as key when the group is empty. Anyway, it's not necessary, you can always filter empty groups later.
It's easier to group by date (including null) first through the database engine and then apply string formatting in memory:
IQueryable<ServiceTicket> troubletickets = db.ServiceTickets
.Include(t => t.Company)
.Include(t => t.UserProfile);
Dictionary<string, List<ServiceTicket>> ticketGroups =
troubletickets
.GroupBy(ticket => ticket.DueDate)
.AsEnumerable() // Continue in memory
.ToDictionary(g => g.Key.HasValue
? g.Key.Value.ToShortDateString()
: string.Empty,
g => g.ToList());
Now the grouping is by the Key value, not by the First element in the group. The Key is never null, it's always a Nullable<DateTime>, with or without a value.
Side note: you'll notice that EF will not generate a SQL GROUP BY statement. That's because SQL GROUP BY is "destructive": it only returns the grouped columns and aggregate data, not the individual records that a LINQ GroupBy returns. For this reason the generated SQL is pretty bloated, and it may improve performance if you place the AsEnumerable before the .GroupBy.

Converting Collection of Strings to Dictionary

This is probably a simple question, but the answer is eluding me.
I have a collection of strings that I'm trying to convert to a dictionary.
Each string in the collection is a comma-separated list of values that I obtained from a regex match. I would like the key for each entry in the dictionary to be the fourth element in the comma-separated list, and the corresponding value to be the second element in the comma-separated list.
When I attempt a direct call to ToDictionary, I end up in some kind of loop that appears to kick me out of the BackgroundWorker thread I'm in:
var MoveFromItems = matches.Cast<Match>()
.SelectMany(m => m.Groups["args"].Captures
.Cast<Capture>().Select(c => c.Value));
var dictionary1 = MoveFromItems.ToDictionary(s => s.Split(',')[3],
s => s.Split(',')[1]);
When I create the dictionary manually, everything works fine:
var MoveFroms = new Dictionary<string, string>();
foreach(string sItem in MoveFromItems)
{
string sKey = sItem.Split(',')[3];
string sVal = sItem.Split(',')[1];
if(!MoveFroms.ContainsKey(sKey))
MoveFroms[sKey.ToUpper()] = sVal;
}
I appreciate any help you might be able to provide.
The problem is most likely that the keys have duplicates. You have three options.
Keep First Entry (This is what you're currently doing in the foreach loop)
Keys only have one entry, the first one that shows up - meaning you can have a Dictionary:
var first = MoveFromItems.Select(x => x.Split(','))
.GroupBy(x => x[3])
.ToDictionary(x => x.Key, x => x.First()[1]);
Keep All Entries, Grouped
Keys will have more than one entry (each key returns an Enumerable), and you use a Lookup instead of a Dictionary:
var lookup = MoveFromItems.Select(x => x.Split(','))
.ToLookup(x => x[3], x => x[1]);
Keep All Entries, Flattened
No such thing as a key, simply a flattened list of entries:
var flat = MoveFromItems.Select(x => x.Split(','))
.Select(x => new KeyValuePair<string,string>(x[3], x[1]));
You could also use a tuple here (Tuple.Create(x[3], x[1]);) instead.
Note: You will need to decide where/if you want the keys to be upper or lower case in these cases. I haven't done anything related to that yet. If you want to store the key as upper, just change x[3] to x[3].ToUpper() in everything above.
This splits each item and selects key out of the 4th split-value, and value out of the 2nd split-value, all into a dictionary.
var dictionary = MoveFromItems.Select(s => s.Split(','))
.ToDictionary(split => split[3],
split => split[1]);
There is no point in splitting the string twice, just to use different indices.
This would be just like saving the split results into a local variable, then using it to access index 3 and 1.
However, if indeed you don't know if keys might reoccur, I would go for the simple loop you've implemented, without a doubt.
Although you have a small bug in your loop:
MoveFroms = new Dictionary<string, string>();
foreach(string sItem in MoveFromItems)
{
string sKey = sItem.Split(',')[3];
string sVal = sItem.Split(',')[1];
// sKey might not exist as a key
if (!MoveFroms.ContainsKey(sKey))
//if (!MoveFroms.ContainsKey(sKey.ToUpper()))
{
// but sKey.ToUpper() might exist!
MoveFroms[sKey.ToUpper()] = sVal;
}
}
You should call ContainsKey(sKey.ToUpper()) in your condition as well, if you really want the keys in upper case.
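An alternative that sidesteps the mismatch between the check and the insert is to give the dictionary a case-insensitive comparer, so ContainsKey and the indexer always agree. This is a sketch of a different technique, not the loop above: it stores keys in their original casing rather than upper case, and the sample strings are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class CaseInsensitiveKeysDemo
{
    // Keeps the first value seen for each key, comparing keys case-insensitively.
    public static Dictionary<string, string> KeepFirst(IEnumerable<string> items)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string item in items)
        {
            string[] parts = item.Split(','); // split once, reuse both indices
            string key = parts[3];
            string value = parts[1];
            if (!result.ContainsKey(key))
            {
                result[key] = value;
            }
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical sample data: key is the 4th field, value is the 2nd.
        var items = new[]
        {
            "a,one,b,key",
            "a,two,b,KEY",   // same key, different casing: ignored
            "a,three,b,other",
        };

        var map = KeepFirst(items);
        Console.WriteLine(map.Count);  // 2
        Console.WriteLine(map["Key"]); // one
    }
}
```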
This splits each string in MoveFromItems on ',' and uses the 4th item (index 3) as the key and the 2nd item (index 1) as the value. Because it builds a lookup, duplicate keys are grouped instead of throwing.
var dict = MoveFromItems.Select(x => x.Split(','))
.ToLookup(x => x[3], x => x[1]);

How can I order a Dictionary in C#?

edit: Thanks Jason, the fact that it was a dictionary isn't that important. I just wanted a low runtime. Is that LINQ method fast? Also, I know this is off topic, but what does the n => n mean?
I have a list of numbers and I want to make another list with the numbers that appear most at the beginning and the least at the end.
So what I did was go through the list and check whether the number x was in the dictionary. If it wasn't, I added the key x with the value one. If it was, I changed the value to be the value plus one.
Now I want to order the dictionary so that I can make a list with the ones that appear the most at the beginning and the least at the end.
How can I do that in C#?
ps. runtime is very important.
So it sounds like you have a Dictionary<int, int> where the key represents some integer that you have in a list and corresponding value represents the count of the number of times that integer appeared. You are saying that you want to order the keys by counts sorted in descending order by frequency. Then you can say
// dict is Dictionary<int, int>
var ordered = dict.Keys.OrderByDescending(k => dict[k]).ToList();
Now, it sounds like you started with a List<int> which are the values that you want to count and order by count. You can do this very quickly in LINQ like so:
// list is IEnumerable<int> (e.g., List<int>)
var ordered = list.GroupBy(n => n)
.OrderByDescending(g => g.Count())
.Select(g => g.Key)
.ToList();
Or in query syntax
var ordered = (from n in list
group n by n into g
orderby g.Count() descending
select g.Key).ToList();
Now, if you need to have the intermediate dictionary you can say
var dict = list.GroupBy(n => n)
.ToDictionary(g => g.Key, g => g.Count());
var ordered = dict.Keys.OrderByDescending(k => dict[k]).ToList();
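For reference, here is a small self-contained sketch of the GroupBy approach with made-up sample numbers. The lambda n => n is simply a function that returns its argument, so GroupBy(n => n) groups equal numbers together.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FrequencyOrderDemo
{
    // Returns the distinct numbers, most frequent first.
    public static List<int> OrderByFrequency(IEnumerable<int> numbers)
    {
        return numbers.GroupBy(n => n)                    // one group per distinct number
                      .OrderByDescending(g => g.Count())  // biggest groups first
                      .Select(g => g.Key)
                      .ToList();
    }

    public static void Main()
    {
        var list = new List<int> { 3, 1, 3, 2, 1, 3 };
        List<int> ordered = OrderByFrequency(list);
        Console.WriteLine(string.Join(", ", ordered)); // 3, 1, 2
    }
}
```

OrderByDescending is a stable sort, so numbers with equal counts keep the order in which they first appeared in the list.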
Use the GroupBy extension on IEnumerable<T> to group the numbers and extract the count of each. This creates the dictionary from the list in one statement. Note that a Dictionary does not guarantee enumeration order, so the ordering applied before ToDictionary should not be relied on.
var ordered = list.GroupBy( l => l )
.OrderByDescending( g => g.Count() )
.ToDictionary( g => g.Key, g => g.Count() );
You may also consider using SortedDictionary.
It keeps the items sorted by key as they are inserted.
List<KeyValuePair<type, type>> listEquivalent =
new List<KeyValuePair<type, type>>(dictionary);
listEquivalent.Sort((first,second) =>
{
return first.Value.CompareTo(second.Value);
});
Something like that maybe?
edit: Thanks Jason for the notice on my omission

Linq query

I'm a big noob with Linq and trying to learn, but I'm hitting a blocking point here.
I have a structure of type:
Dictionary<MyType, List<MyObj>>
And I would like to query with Linq that structure to extract all the MyObj instances that appear in more than one list within the dictionary.
What would such a query look like?
Thanks.
from myObjectList in myObjectDictionary.Values
from myObject in myObjectList.Distinct()
group myObject by myObject into myObjectGroup
where myObjectGroup.Skip(1).Any()
select myObjectGroup.Key
The Distinct() on each list ensures MyObj instances which repeat solely in the same list are not reported.
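Here is a runnable sketch of that query, using string keys and int values as hypothetical stand-ins for MyType and MyObj. An orderby clause is added only to make the output deterministic, since the enumeration order of a dictionary's values is not guaranteed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SharedItemsDemo
{
    // Returns the items that appear in more than one list of the dictionary.
    public static List<int> InMoreThanOneList(Dictionary<string, List<int>> lists)
    {
        return (from list in lists.Values
                from item in list.Distinct()   // ignore repeats within one list
                group item by item into g
                where g.Skip(1).Any()          // at least two lists contain it
                orderby g.Key
                select g.Key).ToList();
    }

    public static void Main()
    {
        var lists = new Dictionary<string, List<int>>
        {
            { "a", new List<int> { 1, 1, 2 } }, // 1 repeats only inside this list
            { "b", new List<int> { 2, 3 } },
            { "c", new List<int> { 3, 4 } },
        };

        List<int> shared = InMoreThanOneList(lists);
        Console.WriteLine(string.Join(", ", shared)); // 2, 3
    }
}
```

Skip(1).Any() stops as soon as a second element is found, which avoids counting the whole group.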
You could do something like this:
var multipleObjs =
MyObjDictionary.Values // Take all the List<MyObj> values
.SelectMany(list => list) // Flatten all the MyObjs from each List<MyObj> into a single IEnumerable
.GroupBy(obj => obj) // Group by the object itself (or by an ID or unique property, if one exists)
.Where(group => group.Count() >= 2) // Filter out any group with fewer than 2 objects
.Select(group => group.Key); // Re-select the objects using the key.
Edit
I realized that this could also be read differently: it doesn't matter if a MyObj occurs multiple times in the same list, only if it occurs in multiple different lists. In that case, when we initially flatten the lists of MyObjs we can select Distinct values, or use a slightly different query:
var multipleObjs =
MyObjDictionary.Values // Take all the List<MyObj> values
.SelectMany(v => v.Distinct()) // Flatten the distinct MyObjs from each List<MyObj> into a single IEnumerable
.GroupBy(obj => obj) // Group by the object itself (or by an ID or unique property, if one exists)
.Where(group => group.Count() >= 2) // Filter out any group with fewer than 2 objects
.Select(group => group.Key); // Re-select the objects using the key.
var multipleObjs =
MyObjDictionary.SelectMany(kvp => // Select from all the KeyValuePairs
kvp.Value.Where(obj =>
MyObjDictionary.Any(kvp2 => // Where any of the KeyValuePairs
(kvp.Key != kvp2.Key) // Is Not the current KeyValuePair
&& kvp2.Value.Contains(obj)))); // And also contains the same MyObj.
