Am I using Dictionary wrong? It seems too slow

I've used the VS profiler and noticed that the program spends ~40% of its time in the lines below.
I'm using title1 and color1 because either Visual Studio or ReSharper suggested doing so. Are there any performance issues in the code below?
Dictionary<Item, int> price_cache = new Dictionary<Item, int>();
....
string title1 = title;
string color1 = color;
if (price_cache.Keys.Any(item => item.Title == title1 && item.Color == color1))
{
    price = price_cache[price_cache.Keys.First(item => item.Title == title1 && item.Color == color1)];
}

The problem is that your Keys.Any method iterates through all keys in your dictionary to find if there is a match. After that, you use the First method to do the same thing again.
Dictionary is suited for operations when you already have the key and want to get the value fast. In that case, it will calculate the hash code of your key (Item, in your case) and use it to "jump" to the bucket where your item is stored.
First, you need to make your custom comparer to let the Dictionary know how to compare items.
class TitleColorEqualityComparer : IEqualityComparer<Item>
{
    public bool Equals(Item a, Item b)
    {
        // you might also check for nulls here
        return a.Title == b.Title &&
               a.Color == b.Color;
    }

    public int GetHashCode(Item obj)
    {
        // this should be as unique as possible,
        // but not too complicated to calculate
        int hash = 17;
        hash = hash * 31 + obj.Title.GetHashCode();
        hash = hash * 31 + obj.Color.GetHashCode();
        return hash;
    }
}
Then, instantiate your dictionary using your custom comparer:
Dictionary<Item, int> price_cache =
new Dictionary<Item, int>(new TitleColorEqualityComparer());
From this point on, you can simply write:
Item some_item = GetSomeItem();
price_cache[some_item] = 5; // to quickly set or change a value
or, to search the dictionary:
Item item = GetSomeItem();
int price = 0;
if (price_cache.TryGetValue(item, out price))
{
    // we got the price
}
else
{
    // there is no such key in the dictionary
}
[Edit]
And to emphasize again: never iterate the Keys property to look for a key. If you do that, you don't need a Dictionary at all; you can simply use a list and get the same (or even slightly better) performance.

Try using an IEqualityComparer as shown in the sample code on this page: http://msdn.microsoft.com/en-us/library/ms132151.aspx and make it calculate the hash code based on the title and color.

As Jesus Ramos suggested (when he said use a different data structure), you could make the key a string that is a concatenation of the title and color, then concatenate the search string and look for that. It should be faster.
So a key could look like name1:FFFFFF (the name, a colon, then the hex of the color), then you would just format the search string the same way.
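A minimal sketch of the concatenated-key idea above. The class and method names are hypothetical, and it assumes a title never contains the ':' separator (otherwise two different title/color pairs could produce the same key):

```csharp
using System.Collections.Generic;

// Hypothetical helper: a price cache keyed by "title:color" strings.
public static class StringKeyPriceCache
{
    private static readonly Dictionary<string, int> Cache =
        new Dictionary<string, int>();

    // Builds the lookup key, e.g. "name1:FFFFFF".
    // Assumption: the title never contains ':'.
    public static string MakeKey(string title, string color)
    {
        return title + ":" + color;
    }

    // Adds a price or overwrites the existing one.
    public static void SetPrice(string title, string color, int price)
    {
        Cache[MakeKey(title, color)] = price;
    }

    // Returns true and the cached price when the pair is known.
    public static bool TryGetPrice(string title, string color, out int price)
    {
        return Cache.TryGetValue(MakeKey(title, color), out price);
    }
}
```

Both the insert and the lookup are single hash probes on the string key, so no scan over Keys is needed.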

Replace your price_cache.Keys.Any() with price_cache.Keys.SingleOrDefault() (using the same predicate). That way you can store the result in a variable and check it for null; if it is not null, you already have the item you searched for, instead of searching for it twice as you do here.

If you want fast access to your hash table, you need to override the GetHashCode and Equals methods:
public class Item
{
    .....
    public override int GetHashCode()
    {
        return (this.color.GetHashCode() + this.title.GetHashCode()) / 2;
    }

    public override bool Equals(object o)
    {
        if (this == o) return true;
        var item = o as Item;
        return (item != null) && (item.color == color) && (item.title == title);
    }
}
Access your dictionary like this:
Item item = ... // create sample item
int price = 0;
bool exists = price_cache.ContainsKey(item); // check whether the key is present
price = price_cache[item]; // throws KeyNotFoundException if the key is missing
price_cache.TryGetValue(item, out price); // returns false if the key is missing

Related

Is it possible to accelerate the From-Where-Select method?

I have two variants of the code.
The first:
struct pair_fiodat { public string fio; public string dat; }
List<pair_fiodat> list_fiodat = new List<pair_fiodat>();
// list filled with 200,000 records, omitted.
foreach (string fname in XML_files)
{
    // get FullName and BirthDate from the file. Omitted.
    var usersLookUp = list_fiodat.ToLookup(u => u.fio, u => u.dat); // create map
    var dates = usersLookUp[FullName];
    if (dates.Count() > 0)
    {
        foreach (var dt in dates)
        {
            if (dt == BirthDate) return true;
        }
    }
}
And the second:
struct pair_fiodat { public string fio; public string dat; }
List<pair_fiodat> list_fiodat = new List<pair_fiodat>();
// list filled with 200,000 records, omitted.
foreach (string fname in XML_files)
{
    // get FullName and BirthDate from the file. Omitted.
    var members = from s in list_fiodat where s.fio == FullName & s.dat == BirthDate select s;
    if (members.Count() > 0) return true;
}
They do the same job: searching for a user by name and birthday.
The first one works very quickly.
The second is very slow (10x-50x slower).
Is it possible to speed up the second one?
I mean, maybe the list needs some special preparation?
I tried sorting: list_fiodat_sorted = list_fiodat.OrderBy(x => x.fio).ToList();, but...

I skipped your first test and changed Count() to Any() (Count iterates over the whole list, while Any stops as soon as it finds an element).
public bool Test1(List<pair_fiodat> list_fiodat)
{
    foreach (string fname in XML_files)
    {
        var members = from s in list_fiodat
                      where s.fio == fname & s.dat == BirthDate
                      select s;
        if (members.Any())
            return true;
    }
    return false;
}
If you want to optimize something, you have to give up some of the conveniences the language offers, because they are usually not free; they have a cost.
For example, for can be slightly faster than foreach. It is a bit uglier, since you need an extra statement to get the variable, but on a very large collection the saving in each iteration adds up.
LINQ is very powerful and a pleasure to work with, but it has a cost. If you replace it with another for loop, you save time.
public bool Test2(List<pair_fiodat> list_fiodat)
{
    for (int i = 0; i < XML_files.Count; i++)
    {
        string fname = XML_files[i];
        for (int j = 0; j < list_fiodat.Count; j++)
        {
            var s = list_fiodat[j];
            if (s.fio == fname & s.dat == BirthDate)
            {
                return true;
            }
        }
    }
    return false;
}
With ordinary collections there is no noticeable difference, and you would normally use foreach, LINQ and so on, but in extreme cases you have to go low level.
In your first test, ToLookup is the key. It takes a long time. Think about it: you iterate over the entire list, creating and filling the map. That is costly in any case, but consider the case where the item you are looking for is at the start of the list: a few iterations would be enough to find it, yet you spend time on every item of the list creating the map. Only in the worst case is the time similar, and even then the map is slower because of the cost of creating it.
The map is worthwhile if you need, for example, all the items that match some condition, that is, a list rather than a single item. You spend the time to create the map once, but then you use it many times, and each time you save time (the map is "direct access", whereas the for loop is "sequential").
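The "create the map once, use it many times" idea could look roughly like the sketch below. It assumes the record list does not change between lookups; BirthdayIndex and its members are hypothetical names, and the two strings in each tuple stand in for the fio and dat fields:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical index over (full name, birth date) pairs.
public sealed class BirthdayIndex
{
    private readonly HashSet<Tuple<string, string>> pairs =
        new HashSet<Tuple<string, string>>();

    // Built once, in a single pass over the records, before any lookups.
    public BirthdayIndex(IEnumerable<Tuple<string, string>> records)
    {
        foreach (var record in records)
        {
            pairs.Add(record);
        }
    }

    // Each lookup is a hash probe instead of a scan of the whole list.
    public bool Contains(string fullName, string birthDate)
    {
        return pairs.Contains(Tuple.Create(fullName, birthDate));
    }
}
```

The index would be constructed before the loop over the XML files, so the cost of building it is paid once rather than once per file.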

Generate unique list variable

I have a C# program where I have a list (List<string>) of unique strings. These strings represent the names of different cases. It is not important what they are, but they have to be unique.
cases = new List<string> { "case1", "case3", "case4" };
Sometimes I read cases saved in a text format into my program. Sometimes a case stored in a file has the same name as a case in my program, so I have to rename the new case. Let's say that the name of the case I load from a file is case1.
The trouble is how to rename it without adding a large random string. In my case it should ideally be called case2, but I cannot find a good algorithm that does that. I want to find the smallest number I can add that makes the name unique.

I would use a HashSet, which only accepts unique values.
List<string> cases = new List<string>() { "case1", "case3", "case4" };
HashSet<string> hcases = new HashSet<string>(cases);
string Result = Enumerable.Range(1, 100).Select(x => "case" + x).First(x => hcases.Add(x));
// Result is "case2"
In this sample I try to add the names for 1 to 100 to the HashSet and take the first one for which Add() succeeds.
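If the upper bound of 100 is a concern, the same idea can be written as a plain loop. This is only a sketch; CaseNames and NextFreeName are hypothetical names, and the prefix is passed in rather than parsed from the existing names:

```csharp
using System.Collections.Generic;

public static class CaseNames
{
    // Returns prefix + n for the smallest n >= 1 that is not already taken.
    public static string NextFreeName(IEnumerable<string> existing, string prefix)
    {
        var taken = new HashSet<string>(existing);
        int n = 1;
        // The set is finite, so some n is always free and the loop terminates.
        while (taken.Contains(prefix + n))
        {
            n++;
        }
        return prefix + n;
    }
}
```

For the list { "case1", "case3", "case4" } and the prefix "case", this returns "case2".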

If you have a list of unique strings, consider using a HashSet<string> instead. Since you want incrementing numbers, it sounds as if you should actually use a custom class instead of a string: one that contains a name and a number property. Then you can increment the number, and if you want the full name (or override ToString), use Name + Number.
Let's say that class is Case; you could fill a HashSet<Case>. HashSet.Add returns false on duplicates. Then use a loop which increments the number until the case can be added.
Something like this:
var cases = new HashSet<Case>();
// fill it ...
// later you want to add one from file:
while (!cases.Add(caseFromFile))
{
    // you will get here if the set already contained one with this name+number
    caseFromFile.Number++;
}
A possible implementation:
public class Case
{
    public string Name { get; set; }
    public int Number { get; set; }
    // other properties

    public override string ToString()
    {
        return Name + Number;
    }

    public override bool Equals(object obj)
    {
        Case other = obj as Case;
        if (other == null) return false;
        return other.ToString() == this.ToString();
    }

    public override int GetHashCode()
    {
        return (ToString() ?? "").GetHashCode();
    }
    // other methods
}

The solution is quite simple. Get the max number of case currently stored in the list, increment by one and add the new value:
var max = myList.Max(x => Convert.ToInt32(x.Substring("case".Length))) + 1;
myList.Add("case" + max);
Working fiddle.
EDIT: For filling any "holes" within your collection you may use this:
// Assumes myList is sorted by case number.
var expected = Convert.ToInt32(myList[0].Substring("case".Length));
for (int i = 0; i < myList.Count; i++, expected++) {
    var curIndex = Convert.ToInt32(myList[i].Substring("case".Length));
    if (curIndex != expected)
    {
        break; // found a hole: expected is the first missing number
    }
}
myList.Add("case" + expected);
It checks for every element in your list whether the number after "case" equals the next expected consecutive number. The loop stops at the very first element where that condition is broken, which means there is a hole in the list; if there is no hole, the new case gets the next number after the last one.

Linq OrderBy not sorting correctly 100% of the time

I'm using the Linq OrderBy() function to sort a generic list of Sitecore items by display name, then build a string of pipe-delimited guids, which is then inserted into a Sitecore field. The display name is a model number of a product, generally around 10 digits. At first it seemed like this worked 100% of the time, but the client found a problem with it...
This is one example that we have found so far. The code somehow thinks IC-30R-LH comes after IC-30RID-LH, but the opposite should be true.
I put this into an online alphabetizer like this one and it was able to get it right...
I did try adding StringComparer.InvariantCultureIgnoreCase as a second parameter to the OrderBy() but it did not help.
Here's the code... Let me know if you have any ideas. Note that I am not running this OrderBy() call inside of a loop, at any scope.
private string GetAlphabetizedGuidString(Item i, Field f)
{
    List<Item> items = new List<Item>();
    StringBuilder scGuidBuilder = new StringBuilder();
    if (i != null && f != null)
    {
        foreach (ID guid in ((MultilistField)f).TargetIDs)
        {
            Item target = Sitecore.Data.Database.GetDatabase("master").Items.GetItem(guid);
            if (target != null && !string.IsNullOrEmpty(target.DisplayName)) items.Add(target);
        }
        // Sort it by item name.
        items = items.OrderBy(o => o.DisplayName, StringComparer.InvariantCultureIgnoreCase).ToList();
        // Build a string of pipe-delimited guids.
        foreach (Item item in items)
        {
            scGuidBuilder.Append(item.ID);
            scGuidBuilder.Append("|");
        }
        // Return string which is a list of guids.
        return scGuidBuilder.ToString().TrimEnd('|');
    }
    return string.Empty;
}

I was able to reproduce your problem with the following code:
var strings = new string[] { "IC-30RID-LH", "IC-30RID-RH", "IC-30R-LH", "IC-30R-RH"};
var sorted = strings.OrderBy(s => s);
I was also able to get the desired sort order by adding a comparer to the sort.
var sorted = strings.OrderBy(s => s, StringComparer.OrdinalIgnoreCase);
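That forces an ordinal comparison, which compares the two strings character by character (technically, by the numeric value of each UTF-16 code unit, after uppercasing), which puts the '-' (45) before the 'I' (73). The culture-sensitive comparers give the hyphen very little weight, which is why IC-30RID-LH could end up ahead of IC-30R-LH.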
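As a self-contained sketch of the ordinal sort described above (ModelNumberSort and SortOrdinal are hypothetical names, not part of the original code):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ModelNumberSort
{
    // Sorts by the numeric value of each (uppercased) UTF-16 code unit,
    // so '-' (45) orders before any letter.
    public static List<string> SortOrdinal(IEnumerable<string> names)
    {
        return names.OrderBy(s => s, StringComparer.OrdinalIgnoreCase).ToList();
    }
}
```

For the four model numbers above, this yields IC-30R-LH, IC-30R-RH, IC-30RID-LH, IC-30RID-RH.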

Serially assign values to OrderedDictionary in C#

I have two key-value pairs, and now I want to fill up the larger one with values from the smaller one in a serial manner.
OrderedDictionary pickersPool = new OrderedDictionary(); // Small
OrderedDictionary pickersToTicketMap = new OrderedDictionary(); // Big
pickersPool.Add("emp1", 44);
pickersPool.Add("emp2", 543);
Now I need to update pickersToTicketMap to look like this:
("100", 44);
("109", 543);
("13", 44);
("23", 543);
So basically I need the pickersPool values to cycle through the keys of the pickersToTicketMap dictionary.
I need the pickersPool values to keep cycling through pickersToTicketMap, updating its values serially.
The pickersToTicketMap OrderedDictionary initially has these values:
("100", "null");
("109", "null");
("13", "null");
("23", "null");
So I need the values of the pickersPool OrderedDictionary to fill up those nulls in a repeating fashion.

It sounds like you should start with a List<string> (or possibly a List<int>, given that they all seem to be integers...) rather than populating your map with empty entries to start with. So something like:
List<string> tickets = new List<string> { "100", "109", "13", "23" };
Then you can populate your pickersToTicketMap as:
var pickers = pickersPool.Values;
var pickerIterator = pickers.GetEnumerator();
foreach (var ticket in tickets)
{
    if (!pickerIterator.MoveNext())
    {
        // Start the next picker...
        pickerIterator = pickers.GetEnumerator();
        if (!pickerIterator.MoveNext())
        {
            throw new InvalidOperationException("No pickers available!");
        }
    }
    ticketToPickerMap[ticket] = pickerIterator.Current;
}
Note that I've changed the name from pickersToTicketMap to ticketToPickerMap because that appears to be what you really mean - the key is the ticket, and the value is the picker.
Also note that I'm not disposing of the iterator from pickers. That's generally a bad idea, but in this case I'm assuming that the iterator returned by OrderedDictionary.Values.GetEnumerator() doesn't need disposal.

This may be what you are looking for:
using System.Linq;
...
int i = 0;
// Cast OrderedDictionary to IEnumerable<DictionaryEntry> to be able to use System.Linq
object[] keys = pickersToTicketMap.Cast<DictionaryEntry>().Select(x => x.Key).ToArray();
IEnumerable<DictionaryEntry> pickersPoolEnumerable = pickersPool.Cast<DictionaryEntry>();
// iterate over all keys (in insertion order)
foreach (object key in keys)
{
    // Set the value of key to element i % pickersPool.Count
    // i % pickersPool.Count will return for Count = 2
    // 0, 1, 0, 1, 0, ...
    pickersToTicketMap[key] = pickersPoolEnumerable
        .ElementAt(i % pickersPool.Count).Value;
    i++;
}
PS: The ToArray() is required to have a separate copy of the keys, so you don't get an InvalidOperationException due to changing the collection you are iterating over.
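The same modulo idea can also be sketched with OrderedDictionary's positional (int) indexer, which avoids both the key copy and ElementAt. PickerAssignment and AssignRoundRobin are hypothetical names, and the sketch assumes the keys are not themselves of type int, so that an int argument always means a position:

```csharp
using System;
using System.Collections.Specialized;

public static class PickerAssignment
{
    // Overwrites every value in tickets with values from pool,
    // cycling through the pool in order: 0, 1, ..., n-1, 0, 1, ...
    public static void AssignRoundRobin(OrderedDictionary pool, OrderedDictionary tickets)
    {
        if (pool.Count == 0)
        {
            throw new InvalidOperationException("No pickers available!");
        }
        for (int i = 0; i < tickets.Count; i++)
        {
            // The int indexer addresses entries by position, not by key.
            tickets[i] = pool[i % pool.Count];
        }
    }
}
```

Because the loop uses positions rather than an enumerator, updating the values while looping does not invalidate anything.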

So you want to update the large dictionary's values with consecutive and repeating values from the possibly smaller one? I have two approaches in mind, one simpler:
You can repeat the smaller collection with Enumerable.Repeat. You have to calculate the count. Then you can use SelectMany to flatten it and ToList to create a collection. Then you can use a for loop to update the larger dictionary with the values in the list via an index:
IEnumerable<int> values = pickersPool.Values.Cast<int>();
if (pickersPool.Count < pickersToTicketMap.Count)
{
    // Repeat this collection until it has at least the same size as the larger collection
    values = Enumerable.Repeat(values,
            pickersToTicketMap.Count / pickersPool.Count
            + pickersToTicketMap.Count % pickersPool.Count
        )
        .SelectMany(intColl => intColl);
}
List<int> valueList = values.ToList();
// valueList may be longer than the map, so bound the loop by the map's size
for (int i = 0; i < pickersToTicketMap.Count; i++)
    pickersToTicketMap[i] = valueList[i];
I would prefer the above approach, because it's more readable than my second which uses an "infinite" sequence. This is the extension method:
public static IEnumerable<T> RepeatEndless<T>(this IEnumerable<T> sequence)
{
    while (true)
        foreach (var item in sequence)
            yield return item;
}
Now you can use this code to update the larger dictionary's values:
var endlessPickersPool = pickersPool.Cast<DictionaryEntry>().RepeatEndless();
IEnumerator<DictionaryEntry> endlessEnumerator;
IEnumerator<string> ptmKeyEnumerator;
using ((endlessEnumerator = endlessPickersPool.GetEnumerator()) as IDisposable)
using ((ptmKeyEnumerator = pickersToTicketMap.Keys.Cast<string>().ToList().GetEnumerator()) as IDisposable)
{
    while (endlessEnumerator.MoveNext() && ptmKeyEnumerator.MoveNext())
    {
        DictionaryEntry pickersPoolItem = (DictionaryEntry)endlessEnumerator.Current;
        pickersToTicketMap[ptmKeyEnumerator.Current] = pickersPoolItem.Value;
    }
}
Note that it's important that I use pickersToTicketMap.Keys.Cast<string>().ToList(), because I can't use the original Keys collection. You get an exception if you change the dictionary during enumeration.

Thanks to @Jon Skeet, although he modified my objects too much while trying to provide a hack for this.
After looking at your solution, I implemented the following, which works well for all my objects.
var pickerIterator = pickerPool.GetEnumerator();
foreach (DictionaryEntry ticket in tickets)
{
    if (!pickerIterator.MoveNext())
    {
        // Start the next picker...
        pickerIterator = pickerPool.GetEnumerator();
        if (!pickerIterator.MoveNext())
        {
            throw new InvalidOperationException("No pickers available!");
        }
    }
    ticketToPickerMap[ticket.Key] = pickerIterator.Value.ToString();
}

Simple(?) logic concerning HashSet

I have a HashSet filled with about 50 posts which I want to pair up two by two and insert into my database (each pair is a title and a description that belong together). The problem is that I can't get the logic together. The code below maybe explains a little better what I am thinking of:
foreach (string item in hash)
{
    // Here something that assigns every odd HashSet post to item1, the even ones to item2
    var NewsItem = new News
    {
        NewsTitle = item1,
        NewsDescription = item2
    };
    dbContext db = new dbContext();
    db.News.Add(NewsItem);
    db.SaveChanges();
}

You cannot "pair up" items from hash-based containers, because from the logical standpoint these containers are ordered arbitrarily *.
Therefore, you need to pair up the titles and descriptions when you insert your data into hash sets, like this:
class Message {
    public string Title {get;set;}
    public string Description {get;set;}
    // Both methods must be marked override, otherwise HashSet will not call them
    public override int GetHashCode() {return 31*Title.GetHashCode()+Description.GetHashCode();}
    public override bool Equals(object other) {
        if (other == this) return true;
        Message obj = other as Message;
        if (obj == null) return false;
        return Title.Equals(obj.Title) && Description.Equals(obj.Description);
    }
}
ISet<Message> hash = new HashSet<Message>();
At this point you can insert messages into your hash set. The titles and descriptions will be always paired up explicitly by participating in a single Message object.
* The current implementation from Microsoft does happen to maintain the insertion order, but this is an unfortunate implementation detail and should not be relied upon.

I define the first item in the HashSet is odd(1), and the second even(2), etc.
Then a HashSet is not the right data structure. HashSets are not in any particular order, so if you need to extract the items sequentially then a plain List<string> would work.
That said, one way to do what you need is to use a for loop that gets items two at a time:
using (dbContext db = new dbContext())
{
    for (int i = 0; i < list.Count - 1; i += 2)
    {
        var NewsItem = new News
        {
            NewsTitle = list[i],
            NewsDescription = list[i + 1]
        };
        db.News.Add(NewsItem);
    }
    db.SaveChanges();
}
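The two-at-a-time stepping can be tried without a database. The sketch below is only an illustration: Pairing and PairUp are hypothetical names, and each tuple stands in for a News item's title and description:

```csharp
using System;
using System.Collections.Generic;

public static class Pairing
{
    // Pairs items (0,1), (2,3), ...; a trailing unpaired item is ignored.
    public static List<Tuple<string, string>> PairUp(IList<string> items)
    {
        var pairs = new List<Tuple<string, string>>();
        for (int i = 0; i + 1 < items.Count; i += 2)
        {
            pairs.Add(Tuple.Create(items[i], items[i + 1]));
        }
        return pairs;
    }
}
```

With an odd number of items, the last one has no partner and is skipped, which matches the loop bound used above.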
