Generate unique list variable - c#

I have a C# program where I have a list (List<string>) of unique strings. These strings represent the names of different cases. It does not matter what they are, but they have to be unique.
cases = new List<string> { "case1", "case3", "case4" };
Sometimes I read cases saved in a text format into my program. Sometimes a case stored in a file has the same name as a case already in my program, so I have to rename the new case. Let's say the name of the case I load from a file is case1.
The trouble is how to rename it without adding a large random string. In my case it should ideally be called case2, but I cannot find a good algorithm that does that. I want to find the smallest number I can append that makes the name unique.

I would use a HashSet, which only accepts unique values.
List<string> cases = new List<string>() { "case1", "case3", "case4" };
HashSet<string> hcases = new HashSet<string>(cases);
string Result = Enumerable.Range(1, 100).Select(x => "case" + x).First(x => hcases.Add(x));
// Result is "case2"
In this sample I try to add the names case1 to case100 to the HashSet and take the first one for which Add() succeeds.
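The fixed range of 1 to 100 means First() throws once all hundred names are taken. A minimal sketch of the same idea with an open-ended loop could look like this; the class and method names (CaseNames, NextFreeName) are made up for illustration and are not part of the question.

```csharp
using System.Collections.Generic;

public static class CaseNames
{
    // Hypothetical helper: returns prefix + n for the smallest n >= 1
    // that is not already present in existingNames.
    public static string NextFreeName(IEnumerable<string> existingNames, string prefix)
    {
        // Copy into a HashSet so each membership test is a hash lookup.
        var taken = new HashSet<string>(existingNames);
        int n = 1;
        while (taken.Contains(prefix + n))
        {
            n++;
        }
        return prefix + n;
    }
}
```

Calling CaseNames.NextFreeName(cases, "case") with the list from the question would give the first unused name, without changing the list itself.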

If you have a list of unique strings, consider using a HashSet<string> instead. Since you want incrementing numbers, it sounds as if you should actually use a custom class instead of a string: one that contains a name and a number property. Then you can increment the number, and if you want the full name use Name + Number (or override ToString).
Let's say that class is Case; you could then fill a HashSet<Case>. HashSet.Add returns false on duplicates, so use a loop that increments the number until the item can be added.
Something like this:
var cases = new HashSet<Case>();
// fill it ...
// later you want to add one from file:
while(!cases.Add(caseFromFile))
{
// you will get here if the set already contained one with this name+number
caseFromFile.Number++;
}
A possible implementation:
public class Case
{
public string Name { get; set; }
public int Number { get; set; }
// other properties
public override string ToString()
{
return Name + Number;
}
public override bool Equals(object obj)
{
Case other = obj as Case;
if (other == null) return false;
return other.ToString() == this.ToString();
}
public override int GetHashCode()
{
return (ToString() ?? "").GetHashCode();
}
// other methods
}

The solution is quite simple. Get the max number of case currently stored in the list, increment by one and add the new value:
var max = myList.Max(x => Convert.ToInt32(x.Substring("case".Length))) + 1;
myList.Add("case" + max);
EDIT: To fill any "holes" in your collection you may use this (it assumes the list is sorted by number):
var firstNumber = Convert.ToInt32(myList[0].Substring("case".Length));
for(int i = 0; i < myList.Count; i++) {
// the number this position should hold if there were no holes
var expected = firstNumber + i;
var current = Convert.ToInt32(myList[i].Substring("case".Length));
if (current != expected)
{
myList.Add("case" + expected);
break;
}
}
It checks for every element in the list whether the number after "case" equals the number expected at that position (the first number plus the index). The loop stops at the first element where this is not true, which is where the hole is, and adds the missing name. If the list has no hole, nothing is added and the max-based approach above applies.

Related

Is it possible to accelerate a from-where-select query?

I have two variants of the code.
First:
struct pair_fiodat {string fio; string dat;}
List<pair_fiodat> list_fiodat = new List<pair_fiodat>();
// list filled 200.000 records, omitted.
foreach(string fname in XML_files)
{
// get FullName and Birthday from file. Omitted.
var usersLookUp = list_fiodat.ToLookup(u => u.fio, u => u.dat); // create map
var dates = usersLookUp[FullName];
if (dates.Count() > 0)
{
foreach (var dt in dates)
{
if (dt == BirthDate) return true;
}
}
}
and second:
struct pair_fiodat {string fio; string dat;}
List<pair_fiodat> list_fiodat = new List<pair_fiodat>();
// list filled 200.000 records, omitted.
foreach(string fname in XML_files)
{
// get FullName and Birthday from file. Omitted.
var members = from s in list_fiodat where s.fio == FullName & s.dat == BirthDate select s;
if (members.Count() > 0) return true;
}
Both do the same job: searching for a user by name and birthday.
The first one works very quickly.
The second is very slow (10x-50x slower).
Please tell me, is it possible to accelerate the second one?
I mean, maybe the list needs some special preparation?
I tried sorting: list_fiodat_sorted = list_fiodat.OrderBy(x => x.fio).ToList();, but...
I skipped your first test and changed Count() to Any() (Count iterates the whole list, while Any stops as soon as it finds an element).
public bool Test1(List<pair_fiodat> list_fiodat)
{
foreach (string fname in XML_files)
{
var members = from s in list_fiodat
where s.fio == fname & s.dat == BirthDate
select s;
if (members.Any())
return true;
}
return false;
}
If you want to optimize something, you must give up some of the comforts the language offers, because they are usually not free; they have a cost.
For example, for is faster than foreach. It is a bit uglier, and you need two statements to get the variable, but it is faster. If you iterate over a very big collection, the cost of each iteration adds up.
LINQ is very powerful and wonderful to work with, but it has a cost. If you replace it with another for loop, you save time.
public bool Test2(List<pair_fiodat> list_fiodat)
{
for (int i = 0; i < XML_files.Count; i++)
{
string fname = XML_files[i];
for (int j = 0; j < list_fiodat.Count; j++)
{
var s = list_fiodat[j];
if (s.fio == fname & s.dat == BirthDate)
{
return true;
}
}
}
return false;
}
With normal collections there is no noticeable difference and you would usually use foreach, LINQ and so on, but in extreme cases you must go low level.
In your first test, ToLookup is the key. It takes a long time. Think about it: you iterate the whole list, creating and filling the map. That is bad in any case, but consider the case where the item you are looking for is at the start of the list: you need only a few iterations to find it, yet you spend time on every item of the list to create the map. Only in the worst case is the time similar, and it is still always worse with the map because of the creation itself.
The map is interesting if you need, for example, all the items that match some condition, that is, a list instead of a single item. You spend time creating the map once, but you use it many times, and each time you save time (the map is "direct access", whereas the for loop is "sequential").
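To illustrate the "create once, use many times" point, here is a rough sketch in which the lookup is built a single time, outside the loop over the files, and then queried per file. The type and method names (PairFioDat, UserSearch, BuildLookup, Contains) are invented for this example; only ToLookup and the ILookup indexer come from LINQ.

```csharp
using System.Collections.Generic;
using System.Linq;

public struct PairFioDat
{
    public string Fio;   // full name
    public string Dat;   // birth date
}

public static class UserSearch
{
    // Builds the map once; the result is meant to be reused for every file.
    public static ILookup<string, string> BuildLookup(IEnumerable<PairFioDat> records)
    {
        return records.ToLookup(r => r.Fio, r => r.Dat);
    }

    // True if the lookup holds the given name together with the given birth date.
    public static bool Contains(ILookup<string, string> lookup, string fullName, string birthDate)
    {
        // Indexing a lookup with a missing key yields an empty sequence,
        // so no separate existence check is needed.
        return lookup[fullName].Contains(birthDate);
    }
}
```

With this shape, the expensive pass over the 200.000 records happens once, and each file only costs a hash lookup plus a scan of the dates stored under that one name.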

Does expanding arraylist of objects make a new object?

Assume we have an array list of type Employe. Does expanding its length by 1 make a new object in the list?
Is the code in the else statement correct, and is it recommended?
public void ModifierEmp(int c)
{
for(int i = 0; i < Ann.Count; i++)
{
if(Ann[i].Code == c)
{
Ann[i].saisie();
} else
{
i = Ann.Count + 1; //expanding arraylist ann
Ann[i].saisie(); //saisie a method for the user to input Employe infos
}
}
}
https://imgur.com/VfFHDKu "code snippet"
i = Ann.Count + 1;
The code above is not expanding the list: it is only setting your index variable (i) to have a new value.
If you wanted to make the list bigger, you would have to tell it which object to put into that new space you create. For example:
Ann.Add(anotherItem);
Of course, this gives you the ability to decide whether to add an existing item, create a new item (e.g. Ann.Add(new Something() { Code = c })), or even add a null value to the list (which is not usually a good idea).
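As a sketch of how the method from the question might be restructured, the search can finish first and Add can be called only when nothing matched. The Employe class below is a stand-in with just the members needed here, and FindOrAdd is a made-up name.

```csharp
using System.Collections.Generic;

// Stand-in for the Employe type from the question (assumed shape).
public class Employe
{
    public int Code { get; set; }
    public string Name { get; set; }
}

public static class EmployeList
{
    // Returns the employee with the given code, adding a new one when none exists.
    public static Employe FindOrAdd(List<Employe> ann, int code)
    {
        foreach (Employe e in ann)
        {
            if (e.Code == code) return e;
        }
        // Only reached when the whole list was searched without a match.
        var created = new Employe { Code = code };
        ann.Add(created);   // Add grows the list by one and stores the new object
        return created;
    }
}
```

The caller could then invoke saisie() on whatever FindOrAdd returns, which covers both the "modify" and the "new employee" case.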

Classes with virtually common code

I have a number of custom collection classes. Each serves to provide a collection of various custom types - one custom type to one custom collection. The custom collections inherit List<T> [where T in this case is the specific custom type, rather than a generic] and provide some additional functionality.
I previously did away with the custom collections and had custom methods elsewhere, but I found as I extended the code that I needed the collections with their own methods.
It all works, everything is happy. But it irritates me, because I know I am not doing it properly. The issue is that each class uses pretty much the same code, varying only the type and a parameter, so I feel that it could be implemented as an abstract class, or generic, or extension to List, or ... but I'm not really understanding enough of the differences or how to go about it to be able to sort out what I need.
Here are two of my several collections, so that you get the idea:
// JourneyPatterns
public class JourneyPatterns : List<JourneyPattern>
{
private Dictionary<string, JourneyPattern> jpHashes; // This is a hash table for quick lookup of a JP based on its values
/* Add a journey pattern to the JourneyPatterns collection. Three methods for adding:
1. "Insert Before" (=at) a particular point in the list. This is the method used by all three methods.
2. "Insert After" a particular point in the list. This is "before" shifted by 1 e.g. "after 6" is "before 7"
3. "Append" to the end of the list. This is "before" with a value equal to the list count, and is the same as inherited "Add", but with checks
*/
public JourneyPattern InsertBefore(JourneyPattern JP, int before)
{
// check for a pre-existing JP with the same parameters (ignore ID). Do this by constructing a "key" based on the values to check against
// and looking it up in the private hash dictionary
JourneyPattern existingJP;
if (jpHashes.TryGetValue(JP.hash, out existingJP)) { return existingJP; }
else
{
// construct a new ID for this JP
if (string.IsNullOrWhiteSpace(JP.id)) JP.id = "JP_" + (Count + 1).ToString();
// next check that the ID specified isn't already being used by a different JPS
if (Exists(a => a.id == JP.id)) JP.id = "JP_" + (Count + 1).ToString();
// now do the add/insert
if (before < 0) { Insert(0, JP); } else if (before >= Count) { Add(JP); } else { Insert(before, JP); }
// finally add to the hash table for fast compare / lookup
jpHashes.Add(JP.hash, JP);
return JP;
}
}
public JourneyPattern InsertAfter(JourneyPattern JP, int after) { return InsertBefore(JP, after + 1); }
public JourneyPattern Append(JourneyPattern JP) { return InsertBefore(JP, Count); }
}
// JourneyPatternSections
public class JourneyPatternSections : List<JourneyPatternSection>
{
private Dictionary<string, JourneyPatternSection> jpsHashes; // This is a hash table for quick lookup of a JPS based on its values
/* Add a journey pattern section to the journeyPatternSections collection. Three methods for adding:
1. "Insert Before" (=at) a particular point in the list. This is the method used by all three methods.
2. "Insert After" a particular point in the list. This is "before" shifted by 1 e.g. "after 6" is "before 7"
3. "Append" to the end of the list. This is "before" with a value equal to the list count, and is the same as inherited "Add", but with checks
*/
public JourneyPatternSection InsertBefore(JourneyPatternSection JPS, int before)
{
// check for a pre-existing JPS with the same parameters (ignore ID). Do this by constructing a "key" based on the values to check against
// and looking it up in the private hash dictionary
JourneyPatternSection existingJPS;
if (jpsHashes.TryGetValue(JPS.hash, out existingJPS)) { return existingJPS; }
else
{
// construct a new ID for this JPS
if (string.IsNullOrWhiteSpace(JPS.id)) JPS.id = "JPS_" + (Count + 1).ToString();
// next check that the ID specified isn't already being used by a different JPS
if (Exists(a => a.id == JPS.id)) JPS.id = "JPS_" + (Count + 1).ToString();
// now do the add/insert
if (before < 0) { Insert(0, JPS); } else if (before >= Count) { Add(JPS); } else { Insert(before, JPS); }
// finally add to the hash table for fast compare / lookup
jpsHashes.Add(JPS.hash, JPS);
return JPS;
}
}
public JourneyPatternSection InsertAfter(JourneyPatternSection JPS, int after) { return InsertBefore(JPS, after + 1); }
public JourneyPatternSection Append(JourneyPatternSection JPS) { return InsertBefore(JPS, Count); }
}
As you can see, what is differing is the type (JourneyPattern, or JourneyPatternSection), and the prefix that I am using for the "id" property of the type ("JP_" or "JPS_"). Everything else is common, since the method of determining "uniqueness" (the property "hash") is part of the custom type.
Some of my custom collections require more involved and different implementations of these methods, which is fine, but this is the most common one and I have implemented it about 6 times so far which seems a) pointless, and b) harder to maintain.
Your thoughts and help appreciated!
Assuming that both JourneyPattern and JourneyPatternSection implement a common interface like:
public interface IJourney
{
string hash { get; set; }
string id { get; set; }
}
You can implement a base class for your collections:
public abstract class SpecializedList<T> : List<T> where T : class, IJourney
{
private Dictionary<string, T> jpHashes = new Dictionary<string, T>(); // This is a hash table for quick lookup of an item based on its values
protected abstract string IdPrefix { get; }
/* Add a journey pattern to the JourneyPatterns collection. Three methods for adding:
1. "Insert Before" (=at) a particular point in the list. This is the method used by all three methods.
2. "Insert After" a particular point in the list. This is "before" shifted by 1 e.g. "after 6" is "before 7"
3. "Append" to the end of the list. This is "before" with a value equal to the list count, and is the same as inherited "Add", but with checks
*/
public T InsertBefore(T JP, int before)
{
// check for a pre-existing JP with the same parameters (ignore ID). Do this by constructing a "key" based on the values to check against
// and looking it up in the private hash dictionary
T existingJP;
if (jpHashes.TryGetValue(JP.hash, out existingJP)) { return existingJP; }
else
{
// construct a new ID for this JP
if (string.IsNullOrWhiteSpace(JP.id)) JP.id = IdPrefix + (Count + 1).ToString();
// next check that the ID specified isn't already being used by a different item
if (Exists(a => a.id == JP.id)) JP.id = IdPrefix + (Count + 1).ToString();
// now do the add/insert
if (before < 0) { Insert(0, JP); } else if (before >= Count) { Add(JP); } else { Insert(before, JP); }
// finally add to the hash table for fast compare / lookup
jpHashes.Add(JP.hash, JP);
return JP;
}
}
public T InsertAfter(T JP, int after) { return InsertBefore(JP, after + 1); }
public T Append(T JP) { return InsertBefore(JP, Count); }
}
Then implement each collection:
public class JourneyPatterns : SpecializedList<JourneyPattern>
{
protected override string IdPrefix => "JP_";
}
public class JourneyPatternSections : SpecializedList<JourneyPatternSection>
{
protected override string IdPrefix => "JPS_";
}
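For a quick check of how the pieces fit together, here is a condensed, self-contained sketch of the same design. It keeps only Append, initializes the dictionary up front, and uses a bare-bones JourneyPattern that is assumed for the example rather than taken from the question.

```csharp
using System.Collections.Generic;

public interface IJourney
{
    string hash { get; set; }
    string id { get; set; }
}

// Minimal stand-in for the real JourneyPattern type.
public class JourneyPattern : IJourney
{
    public string hash { get; set; }
    public string id { get; set; }
}

// Condensed base class: only Append, to keep the example short.
public abstract class SpecializedList<T> : List<T> where T : class, IJourney
{
    private readonly Dictionary<string, T> hashes = new Dictionary<string, T>();
    protected abstract string IdPrefix { get; }

    public T Append(T item)
    {
        T existing;
        // An item with the same values already exists: hand back that one.
        if (hashes.TryGetValue(item.hash, out existing)) return existing;
        if (string.IsNullOrWhiteSpace(item.id)) item.id = IdPrefix + (Count + 1);
        Add(item);
        hashes.Add(item.hash, item);
        return item;
    }
}

public class JourneyPatterns : SpecializedList<JourneyPattern>
{
    protected override string IdPrefix => "JP_";
}
```

Appending two patterns with the same hash to a JourneyPatterns instance returns the first object both times, and a pattern with a new hash gets an id built from the subclass prefix.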

Simple(?) logic concerning HashSet

I have a HashSet filled with about 50 posts which I want to pair up two by two and insert into my database (the posts are a title and a description that belong together). The problem is that I can't get the logic together. The code below maybe explains a little better what I am thinking of:
foreach(string item in hash)
{
// Here something that assigns every uneven HashSet-post to item1, the even ones to item2
var NewsItem = new News
{
NewsTitle = item1,
NewsDescription = item2
};
dbContext db = new dbContext();
db.News.Add(NewsItem);
db.SaveChanges();
}
You cannot "pair up" items from hash-based containers, because from the logical standpoint these containers are ordered arbitrarily *.
Therefore, you need to pair up the titles and descriptions when you insert your data into hash sets, like this:
class Message {
public string Title {get;set;}
public string Description {get;set;}
public override int GetHashCode() {return 31*Title.GetHashCode()+Description.GetHashCode();}
public override bool Equals(object other) {
if (other == this) return true;
Message obj = other as Message;
if (obj == null) return false;
return Title.Equals(obj.Title) && Description.Equals(obj.Description);
}
}
ISet<Message> hash = new HashSet<Message>();
At this point you can insert messages into your hash set. The titles and descriptions will be always paired up explicitly by participating in a single Message object.
* The current implementation from Microsoft does maintain the insertion order, but this is an unfortunate implementation detail.
I define the first item in the HashSet is odd(1), and the second even(2), etc.
Then a HashSet is not the right data structure. HashSets are not in any particular order, so if you need to extract the items sequentially, a plain List<string> would work.
That said, one way to do what you need is to use a for loop that gets items two at a time:
using(dbContext db = new dbContext())
{
for(int i = 0; i < list.Count - 1; i += 2)
{
var NewsItem = new News
{
NewsTitle = list[i],
NewsDescription = list[i+1]
};
db.News.Add(NewsItem);
}
// save once, while the context is still in scope
db.SaveChanges();
}
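If it helps to test the pairing logic without a database, the two-at-a-time step can be pulled into a small function. This is only a sketch; Pairing and PairUp are hypothetical names, and KeyValuePair is used simply as a ready-made pair type (Key = title, Value = description).

```csharp
using System.Collections.Generic;

public static class Pairing
{
    // Pairs items (0,1), (2,3), ... ; a trailing unpaired item is ignored.
    public static List<KeyValuePair<string, string>> PairUp(IList<string> items)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        for (int i = 0; i + 1 < items.Count; i += 2)
        {
            pairs.Add(new KeyValuePair<string, string>(items[i], items[i + 1]));
        }
        return pairs;
    }
}
```

Each resulting pair can then be mapped to a News object inside the using block.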

Am I using Dictionary wrong? It seems too slow

I've used the VS profiler and noticed that the program spends ~40% of its time in the lines below.
I'm using title1 and color1 because either Visual Studio or Resharper suggested doing so. Are there any performance issues in the code below?
Dictionary<Item, int> price_cache = new Dictionary<Item, int>();
....
string title1 = title;
string color1 = color;
if (price_cache.Keys.Any(item => item.Title == title && item.Color == color))
{
price = price_cache[price_cache.Keys.First(item => item.Title == title1 && item.Color == color1)];
The problem is that your Keys.Any method iterates through all keys in your dictionary to find if there is a match. After that, you use the First method to do the same thing again.
Dictionary is suited for operations when you already have the key and want to get the value fast. In that case, it will calculate the hash code of your key (Item, in your case) and use it to "jump" to the bucket where your item is stored.
First, you need to make your custom comparer to let the Dictionary know how to compare items.
class TitleColorEqualityComparer : IEqualityComparer<Item>
{
public bool Equals(Item a, Item b)
{
// you might also check for nulls here
return a.Title == b.Title &&
a.Color == b.Color;
}
public int GetHashCode(Item obj)
{
// this should be as much unique as possible,
// but not too complicated to calculate
int hash = 17;
hash = hash * 31 + obj.Title.GetHashCode();
hash = hash * 31 + obj.Color.GetHashCode();
return hash;
}
}
Then, instantiate your dictionary using your custom comparer:
Dictionary<Item, int> price_cache =
new Dictionary<Item, int>(new TitleColorEqualityComparer());
From this point on, you can simply write:
Item some_item = GetSomeItem();
price_cache[some_item] = 5; // to quickly set or change a value
or, to search the dictionary:
Item item = GetSomeItem();
int price = 0;
if (price_cache.TryGetValue(item, out price))
{
// we got the price
}
else
{
// there is no such key in the dictionary
}
[Edit]
And to emphasize again: never iterate the Keys property to look for a key. If you do that, you don't need a Dictionary at all; you can simply use a list and get the same (or even slightly better) performance.
Try using an IEqualityComparer as shown in the sample code on this page: http://msdn.microsoft.com/en-us/library/ms132151.aspx and make it calculate the hash code based on the title and color.
As Jesus Ramos suggested (when he said use a different data structure), you could make the key a string that is a concatenation of the title and color, then concatenate the search string and look for that. It should be faster.
So a key could look like name1:FFFFFF (the name, a colon, then the hex of the color), then you would just format the search string the same way.
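A rough sketch of that concatenated-key idea follows. PriceCache and its members are illustrative names, and the key format assumes the color string never contains a colon, otherwise two different title/color pairs could produce the same key.

```csharp
using System.Collections.Generic;

public class PriceCache
{
    private readonly Dictionary<string, int> prices = new Dictionary<string, int>();

    // Builds the composite key, e.g. "name1:FFFFFF".
    // Assumption: color contains no ':' so the key stays unambiguous.
    private static string MakeKey(string title, string color)
    {
        return title + ":" + color;
    }

    public void Set(string title, string color, int price)
    {
        prices[MakeKey(title, color)] = price;
    }

    // Single hash lookup instead of scanning all keys.
    public bool TryGet(string title, string color, out int price)
    {
        return prices.TryGetValue(MakeKey(title, color), out price);
    }
}
```

The search string is formatted exactly like the stored key, so a lookup is one TryGetValue call rather than an Any() followed by a First().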
Replace your price_cache.Keys.Any() with price_cache.Keys.SingleOrDefault() and this way you can store the result in a variable, check for nullity and if not you already have the searched item instead of searching for it twice like you do here.
If you want fast access to your hashtable, you need to implement the GetHashCode and Equals methods:
public class Item
{
.....
public override int GetHashCode()
{
return (this.color.GetHashCode() + this.title.GetHashCode())/2;
}
public override bool Equals(object o)
{
if (this == o) return true;
var item = o as Item;
return (item != null) && (item.color == color) && (item.title == title);
}
}
Access your dictionary like:
Item item = ...// create sample item
int price = 0;
bool found = price_cache.ContainsKey(item); // is the key present?
price = price_cache[item]; // throws KeyNotFoundException if it is not
price_cache.TryGetValue(item, out price); // check and read in one call
