I have two methods in an IntExtensions class to help generate the next available incremental value (which is not in a list of existing integers which need to be excluded).
I don't think I'm addressing the NextIncrementalValueNotInList method in the best way, and I'm wondering whether I can make better use of LINQ to return the next available int.
public static bool IsInList(this int value, List<int> ListOfIntegers) {
if (ListOfIntegers.Contains(value))
return true;
return false;
}
public static int NextIncrementalValueNotInList(this int value,
List<int> ListOfIntegers) {
int maxResult;
maxResult = ListOfIntegers.Max() + 1;
for (int i = value; i <= maxResult; i++)
{
if (!(i.IsInList(ListOfIntegers)))
{
return i;
}
}
return maxResult;
}
Using LINQ, your method will look like:
return Enumerable.Range(1, ListOfIntegers.Count + 1)
.Except(ListOfIntegers)
.First();
I'm assuming the sequence starts at 1.
You could also proceed like this:
Enumerable.Range(1, ListOfIntegers.Count)
.Where(i => !ListOfIntegers.Contains(i))
.Union(new []{ ListOfIntegers.Count + 1 })
.First();
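Both LINQ snippets above begin the search at 1, whereas the extension method in the question begins at value. Here is a minimal sketch of the same Except idea that honours the start value. The names IntSearch and NextFree are hypothetical, and it assumes value + taken.Count stays within the range of int:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IntSearch
{
    // Returns the smallest integer >= value that is not in 'taken'.
    // A window of taken.Count + 1 consecutive candidates must contain
    // at least one free value, so First() always finds one.
    public static int NextFree(this int value, ICollection<int> taken)
    {
        return Enumerable.Range(value, taken.Count + 1)
                         .Except(taken)
                         .First();
    }
}
```

Called as 3.NextFree(existing), it only looks at candidates from 3 upwards, which matches the loop in the question.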
You don't actually need to calculate the Max value - just keep incrementing i until you find a value that doesn't exist in the list, e.g:
public static int NextIncrementalValueNotInList(this int value,
List<int> ListOfIntegers)
{
int i = value;
while(true)
{
if (!(i.IsInList(ListOfIntegers)))
{
return i;
}
i++;
}
}
Besides that, I'm not sure there's much more you can do about this unless:
ListOfIntegers is guaranteed to be, or needs to be, sorted, or
ListOfIntegers doesn't actually need to be a List<int>
If the answer to the first is no, and to the second is yes, then you might instead use a HashSet<int>, which might provide a faster implementation by allowing you to simply use HashSet<T>'s own bool Contains(T) method:
public static int NextIncrementalValueNotInList(this int value,
HashSet<int> ListOfIntegers)
{
int i = value;
while(true)
{
if (!ListOfIntegers.Contains(i))
{
return i;
}
i++;
}
}
Note that this version shows how to do away with the Max check also.
Although be careful of premature optimisation - if your current implementation is fast enough, then I wouldn't worry. You should properly benchmark any alternative solution with extreme cases as well as real-world cases to see if there's actually any difference.
Also what you don't want to do is use my suggestion above by turning your list into a HashSet for every call. I'm suggesting changing entirely your use of List to HashSet - any piecemeal conversion per-call will negate any potential performance benefits due to the overhead of creating the HashSet.
Finally, if you're not actually expecting much fragmentation in your integer list, then it's possible that a HashSet might not be much different from the current Linq version, because it's possibly going to end up doing similar amounts of work anyway.
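To illustrate the point about keeping a HashSet for the lifetime of the data rather than converting per call, here is a rough sketch. The UsedNumbers class and its Claim method are hypothetical names, and it assumes the search never runs past int.MaxValue:

```csharp
using System.Collections.Generic;

// Hypothetical wrapper: the set is built once and reused for every lookup.
public class UsedNumbers
{
    private readonly HashSet<int> used;

    public UsedNumbers(IEnumerable<int> initial)
    {
        used = new HashSet<int>(initial); // one-off construction cost
    }

    // Finds the first free value at or after 'start' and records it as used.
    public int Claim(int start)
    {
        int i = start;
        while (used.Contains(i))
        {
            i++;
        }
        used.Add(i);
        return i;
    }
}
```

Because the set is a field, the cost of building it is paid once, not on every search.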
Related
Any easier way to write this if statement?
if (value==1 || value==2)
For example... in SQL you can say where value in (1,2) instead of where value=1 or value=2.
I'm looking for something that would work with any basic type... string, int, etc.
How about:
if (new[] {1, 2}.Contains(value))
It's a hack though :)
Or if you don't mind creating your own extension method, you can create the following:
public static bool In<T>(this T obj, params T[] args)
{
return args.Contains(obj);
}
And you can use it like this:
if (1.In(1, 2))
:)
A more complicated way :) that emulates SQL's 'IN':
public static class Ext {
public static bool In<T>(this T t,params T[] values){
foreach (T value in values) {
if (t.Equals(value)) {
return true;
}
}
return false;
}
}
if (value.In(1,2)) {
// ...
}
But go for the standard way, it's more readable.
EDIT: a better solution, according to @Kobi's suggestion:
public static class Ext {
public static bool In<T>(this T t,params T[] values){
return values.Contains(t);
}
}
C# 9 supports this directly:
if (value is 1 or 2)
However, in many cases a switch might be clearer (especially with the more recent switch syntax enhancements). You can see this here, with if (value is 1 or 2) getting compiled identically to if (value == 1 || value == 2).
Is this what you are looking for ?
if (new int[] { 1, 2, 3, 4, 5 }.Contains(value))
If you have a List, you can use .Contains(yourObject), if you're just looking for it existing (like a where). Otherwise look at Linq .Any() extension method.
Using Linq,
if(new int[] {1, 2}.Contains(value))
But I'd have to think that your original if is faster.
Alternatively, and this would give you more flexibility if testing for values other than 1 or 2 in future, is to use a switch statement
switch(value)
{
case 1:
case 2:
return true;
default:
return false;
}
If you search a value in a fixed list of values many times in a long list, HashSet<T> should be used. If the list is very short (< ~20 items), List could have better performance, based on this test
HashSet vs. List performance
HashSet<int> nums = new HashSet<int> { 1, 2, 3, 4, 5 };
// ....
if (nums.Contains(value))
Generally, no.
Yes, there are cases where the list is in an Array or List, but that's not the general case.
An extension method like this would do it...
public static bool In<T>(this T item, params T[] items)
{
return items.Contains(item);
}
Use it like this:
Console.WriteLine(1.In(1,2,3));
Console.WriteLine("a".In("a", "b"));
You can use a switch expression with pattern matching (another version of jules's answer):
if (value switch{1 or 3 => true,_ => false}){
// do something
}
Easier is subjective, but maybe the switch statement would be easier? You don't have to repeat the variable, so more values can fit on the line, and a line with many comparisons is more legible than the counterpart using the if statement.
In vb.net or C# I would expect that the fastest general approach to compare a variable against any reasonable number of separately-named objects (as opposed to e.g. all the things in a collection) will be to simply compare each object against the comparand much as you have done. It is certainly possible to create an instance of a collection and see if it contains the object, and doing so may be more expressive than comparing the object against all items individually, but unless one uses a construct which the compiler can explicitly recognize, such code will almost certainly be much slower than simply doing the individual comparisons. I wouldn't worry about speed if the code will by its nature run at most a few hundred times per second, but I'd be wary of the code being repurposed to something that's run much more often than originally intended.
An alternative approach, if a variable is something like an enumeration type, is to choose power-of-two enumeration values to permit the use of bitmasks. If the enumeration type has 32 or fewer valid values (e.g. starting Harry=1, Ron=2, Hermione=4, Ginny=8, Neville=16), one could store them in an integer and check for multiple bits at once in a single operation: if ((thisOne & (Harry | Ron | Neville | Beatrix)) != 0) /* Do something */. This will allow for fast code, but is limited to enumerations with a small number of values.
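As a rough sketch of the bitmask idea, using only the names and values listed above: the Character enum and the IsAnyOf helper are hypothetical, and [Flags] is there mainly to signal intent and improve ToString output.

```csharp
using System;

[Flags]
public enum Character
{
    None = 0,
    Harry = 1,
    Ron = 2,
    Hermione = 4,
    Ginny = 8,
    Neville = 16
}

public static class CharacterChecks
{
    // True when 'who' shares at least one bit with 'mask'.
    public static bool IsAnyOf(Character who, Character mask)
    {
        return (who & mask) != 0;
    }
}
```

A call such as CharacterChecks.IsAnyOf(thisOne, Character.Harry | Character.Ron | Character.Neville) tests all three candidates with a single AND.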
A somewhat more powerful approach, but one which must be used with care, is to use some bits of the value to indicate attributes of something, while other bits identify the item. For example, bit 30 could indicate that a character is male, bit 29 could indicate friend-of-Harry, etc. while the lower bits distinguish between characters. This approach would allow for adding characters who may or may not be friend-of-Harry, without requiring the code that checks for friend-of-Harry to change. One caveat with doing this is that one must distinguish between enumeration constants that are used to SET an enumeration value, and those used to TEST it. For example, to set a variable to indicate Harry, one might want to set it to 0x60000001, but to see if a variable IS Harry, one should bit-test it with 0x00000001.
One more approach, which may be useful if the total number of possible values is moderate (e.g. 16-16,000 or so), is to have an array of flags associated with each value. One could then code something like "if ((characterAttributes[theCharacter] & characterAttribute.Male) != 0)". This approach will work best when the number of characters is fairly small. If the array is too large, cache misses may slow down the code to the point that testing against a small number of characters individually would be faster.
Using Extension Methods:
public static class ObjectExtension
{
public static bool In(this object obj, params object[] objects)
{
if (objects == null || obj == null)
return false;
object found = objects.FirstOrDefault(o => o.GetType().Equals(obj.GetType()) && o.Equals(obj));
return (found != null);
}
}
Now you can do this:
string role= "Admin";
if (role.In("Admin", "Director"))
{
...
}
public static bool EqualsAny<T>(IEquatable<T> value, params T[] possibleMatches) {
foreach (T t in possibleMatches) {
if (value.Equals(t))
return true;
}
return false;
}
public static bool EqualsAny<T>(IEquatable<T> value, IEnumerable<T> possibleMatches) {
foreach (T t in possibleMatches) {
if (value.Equals(t))
return true;
}
return false;
}
I had the same problem but solved it with a switch statement
switch (a value you are switching on)
{
case 1:
the code you want to happen;
break;
case 2:
the code you want to happen;
break;
default:
return a value;
}
In this situation where one member is edited to become equal to another, what is the proper way to force the HashSet to recalculate hashes and thereby purge itself of duplicates?
I knew better than to expect this to happen automatically, so I tried such things as intersecting the HashSet with itself, then reassigning it to a constructor call which refers to itself and the same EqualityComparer. I thought for sure the latter would work, but no.
One thing which does succeed is reconstructing the HashSet from its conversion to some other container type such as List, rather than directly from itself.
Class defs:
public class Test {
public int N;
public override string ToString() { return this.N.ToString(); }
}
public class TestClassEquality: IEqualityComparer<Test> {
public bool Equals(Test x, Test y) { return x.N == y.N; }
public int GetHashCode(Test obj) { return obj.N.GetHashCode(); }
}
Test code:
TestClassEquality eq = new TestClassEquality();
HashSet<Test> hs = new HashSet<Test>(eq);
Test a = new Test { N = 1 }, b = new Test { N = 2 };
hs.Add(a);
hs.Add(b);
b.N = 1;
string fmt = "Count = {0}; Values = {1}";
Console.WriteLine(fmt, hs.Count, string.Join(",", hs));
hs.IntersectWith(hs);
Console.WriteLine(fmt, hs.Count, string.Join(",", hs));
hs = new HashSet<Test>(hs, eq);
Console.WriteLine(fmt, hs.Count, string.Join(",", hs));
hs = new HashSet<Test>(new List<Test>(hs), eq);
Console.WriteLine(fmt, hs.Count, string.Join(",", hs));
Output:
Count = 2; Values = 1,1
Count = 2; Values = 1,1
Count = 2; Values = 1,1
Count = 1; Values = 1
Based on the final approach succeeding, I could probably create an extension method in which the HashSet dumps itself into a local List, clears itself, and then repopulates from said list.
Is that really necessary or is there some simpler way to do this?
Lasse's comment is correct: you are required by the contract of HashSet to not do this, so asking what to do when you do this is a non-starter. If it hurts when you do that, stop doing that. A mutable object must not be put into a hash set if a mutation will cause its hash value to change while it is in the set. You're in a cleft stick of your own making.
To get out of that cleft stick, you could:
Stop mutating the objects while they are in a hash set. Remove them before you mutate them, put them back in later.
Fix the implementation of equality and hashing on the object so that it is consistent across mutations.
When you create the hash set, provide a custom hashing/equality algorithm that does not change its opinions when the object is mutated.
Implement your own "set" class that has whatever behaviour you like in this scenario. That is extremely difficult, so be careful. (There is a reason why this restriction was created in the first place!)
There is no other way than recreating the HashSet<>. Sadly, the HashSet<> constructor has an optimization so that if it is created from another HashSet<> it copies the hash codes... So we can cheat:
hs = new HashSet<Test>(hs.Skip(0), eq);
The hs.Skip(0) is an IEnumerable<>, not a HashSet<>. This defeats the HashSet<> check.
Note that there is no guarantee that in the future the Skip() won't implement a shortcircuit in case of 0, something like:
if (count == 0)
{
return enu;
}
else
{
return count elements;
}
(see Lippert's comment, false problem)
The "manual" method to do it is:
var hs2 = new HashSet<Test>(eq);
foreach (var value in hs)
{
hs2.Add(value);
}
hs = hs2;
So enumerate "manually" and re-add.
As you saw, HashSets don't deal with mutable objects when modifying the object affects its hash code or equality to other objects. Just remove it and re-add it:
hs.Remove(b);
b.N = 1;
hs.Add(b);
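The remove, mutate, re-add sequence can be wrapped in a small helper so that it is hard to forget a step. This is only a sketch: the HashSetMutation class and its Update method are hypothetical names, Test and TestClassEquality are repeated from the question so that the block compiles on its own, and the helper assumes the item has not already been mutated when it is called.

```csharp
using System;
using System.Collections.Generic;

// Repeated from the question so the sketch is self-contained.
public class Test
{
    public int N;
}

public class TestClassEquality : IEqualityComparer<Test>
{
    public bool Equals(Test x, Test y) { return x.N == y.N; }
    public int GetHashCode(Test obj) { return obj.N.GetHashCode(); }
}

public static class HashSetMutation
{
    // Removes 'item', applies 'change', then re-adds it so that it is stored
    // under its new hash code. Returns false when the item was not in the set,
    // or when the changed item now equals another member and was not re-added.
    public static bool Update<T>(this HashSet<T> set, T item, Action<T> change)
    {
        if (!set.Remove(item))
        {
            return false;
        }
        change(item);
        return set.Add(item);
    }
}
```

With the objects from the question, hs.Update(b, t => t.N = 1) leaves a single element in the set and returns false, signalling that b collapsed into an existing member.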
I'm using a recursive version of the insertion sort algorithm to sort 5000 objects based upon a randomly generated integer property, but I've been getting a stackoverflow exception only at an ArrayList of this size while working fine for ArrayLists of other sizes.
I used Console.WriteLine to see what the "position" integer goes up to in one of my methods, and it ends up at 4719 before skipping a line and giving a stackoverflow exception. How should I get around this?
I should also mention that when testing an iterative version of insertion sort in the same Visual Studio solution and using an ArrayList of the same size of objects I do not get a stackoverflow exception.
My code for the recursive insertion sort is below (AL is the ArrayList):
public void IS()
{
ISRM(0);
}
private void ISRM(int position)
{
if (position == AL.Count)
return;
Console.WriteLine(position);
int PositionNext = position + 1;
ISRMNext(position, PositionNext);
ISRM(position + 1);
}
private void ISRMNext(int position, int PositionNext)
{
if ((PositionNext == 0) || (PositionNext == AL.Count))
return;
Webpage EntryNext = (Webpage)AL[PositionNext];
Webpage EntryBefore = (Webpage)AL[PositionNext - 1];
if (EntryBefore.getVisitCount() < EntryNext.getVisitCount())
{
Webpage temp = EntryBefore;
AL[PositionNext - 1] = AL[PositionNext];
AL[PositionNext] = temp;
}
ISRMNext(position, PositionNext - 1);
}
Well, first of all, sorting through recursive call is a bad idea for several reasons.
As you've already found out, this easily leads to a stack overflow due to limited size of the stack.
It will generally perform worse, since each function call, with its accompanying allocation of a local context on the stack, is much more expensive than one iteration of a while or for loop over a plain collection.
These are probably the two reasons why @Zer0 suggested it, but there's more to it.
There's a ready-made ArrayList.Sort() method waiting for you that takes a custom comparer. All you need is to write said comparer for your custom objects according to whatever rules you want and call Sort(your_comparer). That's it. You do not need to reinvent the wheel by implementing your own sorting method, unless implementing a sorting method is the actual goal of your program... but I honestly doubt it.
So, it could be something like this (not tested!):
class MyComparer : IComparer
{
public int Compare(object x, object y)
{
var _x = ((Webpage) x).getVisitCount();
var _y = ((Webpage) y).getVisitCount();
if (_x < _y)
{
return -1;
}
if (_x > _y)
{
return 1;
}
return 0;
}
}
Usage:
var myAL = new ArrayList();
// ... filling up the myAL
myAL.Sort(new MyComparer());
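If writing the sort by hand does turn out to be the goal, the recursion can be replaced by two nested loops, which keeps the stack depth constant regardless of the collection size. This is a minimal sketch and not the asker's code; the InsertionSorter class is a hypothetical name, and it works on a generic IList<T> instead of an ArrayList:

```csharp
using System;
using System.Collections.Generic;

public static class InsertionSorter
{
    // Sorts 'items' in place. 'compare' follows the usual convention:
    // a negative result means the first argument should come first.
    public static void Sort<T>(IList<T> items, Comparison<T> compare)
    {
        for (int i = 1; i < items.Count; i++)
        {
            T current = items[i];
            int j = i - 1;
            // Shift every element that should come after 'current' one slot right.
            while (j >= 0 && compare(items[j], current) > 0)
            {
                items[j + 1] = items[j];
                j--;
            }
            items[j + 1] = current;
        }
    }
}
```

Passing (x, y) => y.CompareTo(x) as the comparison sorts in descending order, which is the order the swap condition in the question appears to aim for.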
[TestFixture]
class HashSetExample
{
[Test]
public void eg()
{
var comparer = new OddEvenBag();
var hs = new HashSet<int>(comparer);
hs.Add(1);
Assert.IsTrue(hs.Contains(3));
Assert.IsFalse(hs.Contains(0));
// THIS LINE HERE
var containedValue = hs.First(x => comparer.Equals(x, 3)); // i want something faster than this
Assert.AreEqual(1, containedValue);
}
public class OddEvenBag : IEqualityComparer<int>
{
public bool Equals(int x, int y)
{
return x % 2 == y % 2;
}
public int GetHashCode(int obj)
{
return obj % 2;
}
}
}
As well as checking whether hs contains an odd number, I want to know which odd number it contains. Obviously I want a method that scales reasonably and does not simply iterate-and-search over the entire collection.
Another way to rephrase the question is, I want to replace the line below THIS LINE HERE with something efficient (say O(1), instead of O(n)).
Towards what end? I'm trying to intern a laaaaaaaarge number of immutable reference objects similar in size to a Point3D. Seems like using a HashSet<Foo> instead of a Dictionary<Foo,Foo> saves about 10% in memory. No, obviously this isn't a game changer but I figured it would not hurt to try it for a quick win. Apologies if this has offended anybody.
Edit: Link to similar/identical post provided by Balazs Tihanyi in comments, put here for emphasis.
The simple answer is no, you can't.
If you want to retrieve the object you will need to use a Dictionary<Foo,Foo>. There just isn't any suitable method in the API to do what you are asking for otherwise.
One optimization you could make, though, if you must use a set for this, is to first do a Contains check and then only iterate over the set if Contains returns true. Still, you would almost certainly find that the extra overhead of a Dictionary is tiny (since essentially it's just another object reference).
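On newer frameworks there is a direct lookup: HashSet<T>.TryGetValue (added in .NET Framework 4.7.2 and .NET Core 2.0) returns the stored element that the set's comparer considers equal to the one you pass in. Here is a minimal sketch of interning with it, assuming one of those framework versions; the Interning class and Intern method are hypothetical names:

```csharp
using System.Collections.Generic;

public static class Interning
{
    // Returns the instance already stored in 'pool' that the pool's comparer
    // treats as equal to 'candidate'; otherwise stores and returns 'candidate'.
    public static T Intern<T>(HashSet<T> pool, T candidate)
    {
        if (pool.TryGetValue(candidate, out T existing))
        {
            return existing;
        }
        pool.Add(candidate);
        return candidate;
    }
}
```

On older frameworks that lack TryGetValue, the Dictionary<Foo,Foo> approach remains the practical choice.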
What is the best or easiest container I could use to retrieve the position of the last entry?
Or is there nothing better or easier than using Count? Is it OK to rely on Count?
Example:
List<Class> myList = new List<Class>();
int lastEntry = myList.Count - 1;
MessageBox.Show(myList[lastEntry].Name);
There are no concurrent writes to this list; it is mainly read.
Using Count is fine for List<T> -- or anything else that implements ICollection<T> or ICollection -- but you have an off-by-one error in your code. It should be...
int lastEntry = myList.Count - 1; // index is zero-based
Count is going to be the most performant, though since list indexing is zero-based you'll want to use count - 1 to retrieve the last entry in the list.
If you really want you can use Linq and do something like:
myList.Last()
or, if you're worried about empty lists
myList.LastOrDefault()
But that is going to most likely be slower (depending on how Last() is implemented)
You could take advantage of the Last() extension method like so:
Message.Box(myList.Last().Name);
You can also use Last, which can help you avoid errors like the one you made.
On a side-note: Last is optimised for IList implementations to use exactly the same method as you did: access with index. Sure it is probably slower than doing it manually (optimisation requires additional cast), but unless it really is a bottleneck I wouldn't worry too much.
If you're interested to investigate this topic deeper, here's part of Jon Skeet's excellent series: Reimplementing LINQ to Objects: Part 11 - First/Single/Last and the ...OrDefault versions
If you just need to access the last item in the list, you might be better off using a Stack<T> instead. For the code you've written, there's nothing wrong with using Count - bear in mind that you should use .Count - 1
Use a Stack:
Stack<Class> d = new Stack<Class>();
Class last = d.Pop();
Message.Box(last.Name);
or if you don't want to remove:
Class last = d.Peek();
Message.Box(last.Name);
I want to make one point that seems to have been glossed over. Lists are not Queues; you don't always add to the end. You can instead Insert to them. If you want the index of the last-inserted item, you have to get a little more creative:
public class SmartList<T> : List<T>
{
public int LastIndex { get; protected set; }
public new virtual void Add(T obj)
{
base.Add(obj);
LastIndex = Count - 1;
}
public new virtual void AddRange(IEnumerable<T> obj)
{
base.AddRange(obj);
LastIndex = Count - 1;
}
// List<T>.Insert takes the index first, then the item.
public new virtual void Insert(int index, T obj)
{
base.Insert(index, obj);
LastIndex = index;
}
}
Unfortunately List's methods are not virtual, so you have to hide them and thus you have to use this class as the concrete SmartList; you can't use it as the value of a List-typed variable or parameter.