Get the last entry position easily? - c#

What is the best or easiest container I could use to retrieve the last entry's position?
Or is there nothing better or easier than using Count? Is it OK to rely on Count?
Example:
List<Class> myList = new List<Class>();
int lastEntry = myList.Count - 1;
Message.Box(myList[lastEntry].Name);
There are no concurrent writes to this list; it is mainly read.

Using Count is fine for List<T> -- or anything else that implements ICollection<T> or ICollection -- but you have an off-by-one error in your code. It should be...
int lastEntry = myList.Count - 1; // index is zero-based

Count is going to be the most performant, though since list indexing is zero-based you'll want to use count - 1 to retrieve the last entry in the list.
If you really want you can use Linq and do something like:
myList.Last()
or, if you're worried about empty lists
myList.LastOrDefault()
But that is going to most likely be slower (depending on how Last() is implemented)
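The options above can be compared side by side. This is a minimal sketch, not code from the question: the class and method names (LastItemDemo, LastByIndex, LastNameOrPlaceholder) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LastItemDemo
{
    // Indexer access: zero-based, so the last element sits at Count - 1.
    // Returns the fallback instead of throwing when the list is empty.
    public static T LastByIndex<T>(List<T> list, T fallback)
    {
        return list.Count > 0 ? list[list.Count - 1] : fallback;
    }

    // LINQ access: LastOrDefault returns default(T) (null for strings)
    // on an empty list, whereas Last() would throw InvalidOperationException.
    public static string LastNameOrPlaceholder(List<string> names)
    {
        return names.LastOrDefault() ?? "(none)";
    }
}
```

For a non-empty list, `names[names.Count - 1]`, `names.Last()` and `LastItemDemo.LastByIndex(names, null)` all return the same element; the variants differ only in how they treat the empty case.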

You could take advantage of the Last() extension method like so:
Message.Box(myList.Last().Name);

You can also use Last, which can help you avoid errors like the one you made.
On a side note: Last is optimised for IList implementations to use exactly the same method as you did: access by index. Sure, it is probably slower than doing it manually (the optimisation requires an additional cast), but unless it really is a bottleneck I wouldn't worry too much.
If you're interested to investigate this topic deeper, here's part of Jon Skeet's excellent series: Reimplementing LINQ to Objects: Part 11 - First/Single/Last and the ...OrDefault versions

If you just need to access the last item in the list, you might be better off using a Stack<T> instead. For the code you've written, there's nothing wrong with using Count - bear in mind that you should use .Count - 1

Use a Stack:
Stack<Class> d = new Stack<Class>();
Class last = d.Pop();
Message.Box(last.Name);
or if you don't want to remove:
Class last = d.Peek();
Message.Box(last.Name);

I want to make one point that seems to have been glossed over. Lists are not Queues; you don't always add to the end. You can instead Insert to them. If you want the index of the last-inserted item, you have to get a little more creative:
public class SmartList<T> : List<T>
{
    // Index of the most recently added or inserted item.
    public int LastIndex { get; protected set; }

    public new virtual void Add(T obj)
    {
        base.Add(obj);
        LastIndex = Count - 1;
    }

    public new virtual void AddRange(IEnumerable<T> obj)
    {
        base.AddRange(obj);
        LastIndex = Count - 1;
    }

    // List<T>.Insert takes the index first, then the item.
    public new virtual void Insert(int index, T obj)
    {
        base.Insert(index, obj);
        LastIndex = index;
    }
}
Unfortunately List's methods are not virtual, so you have to hide them and thus you have to use this class as the concrete SmartList; you can't use it as the value of a List-typed variable or parameter.
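One way around the hiding problem, offered here as an alternative sketch rather than the answer's approach, is to derive from System.Collections.ObjectModel.Collection<T> instead. Its Add and Insert both go through the protected virtual InsertItem, so a single override is enough. The class name TrackingCollection is hypothetical, and removals are deliberately not tracked.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical alternative to SmartList<T>: no method hiding is needed because
// Collection<T> routes both Add and Insert through the virtual InsertItem.
class TrackingCollection<T> : Collection<T>
{
    // Index of the most recently added or inserted item; -1 before any insert.
    // Note: this sketch does not adjust the index when items are removed.
    public int LastIndex { get; private set; } = -1;

    protected override void InsertItem(int index, T item)
    {
        base.InsertItem(index, item);
        LastIndex = index;
    }
}
```

Because the override is virtual, LastIndex stays correct even when the instance is used through a Collection<T> or IList<T> variable, which is the case the hidden List<T> methods cannot cover.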

Related

Looking for a data structure that is optimized for finding the next closest element

I have two classes, let's call them foo and bar, that both have a DateTime property called ReadingTime.
I then have long lists of these classes, let's say foos and bars, where foos is List<foo>, bars is List<bar>.
My goal is for every element in foos to find the events in bars that happened right before and right after foo.
Some code to clarify:
var foos = new List<foo>();
var bars = new List<bar>();
...
foreach (var foo in foos)
{
bar before = bars.Where(b => b.ReadingTime <= foo.ReadingTime).OrderByDescending(b => b.ReadingTime).FirstOrDefault();
bar after = bars.Where(b => b.ReadingTime > foo.ReadingTime).OrderBy(b => b.ReadingTime).FirstOrDefault();
...
}
My issue here is performance. Is it possible to use some other data structure than a list to speed up the comparisons? In particular the OrderBy statement every single time seems like a huge waste, having it pre-ordered should also speed up the comparisons, right?
I just don't know what data structure is best: SortedList, SortedSet, SortedDictionary, etc.; there seem to be so many. Also, all the information I find is about lookups, inserts, deletes, etc. No one writes about finding the next closest element, so I'm not sure if anything is optimized for that.
I'm on .net core 3.1 if that matters.
Thanks in advance!
Edit: Okay so to wrap this up:
First I tried implementing @derloopkat's approach. For this I figured I needed a data type that could save the data in a sorted order so I just left it as IOrderedEnumerable (which is what LINQ returns). Probably not very smart, as that actually brought things to a crawl. I then tried going with SortedList. Had to remove some duplicates first, which was no problem in my case. Thanks for the help @Olivier Rogier! This got me up to roughly 2x the original performance, though I suspect it's mostly the removed LINQ OrderBys. For now this is good enough; if/when I need more performance I'm going to go with what @CamiloTerevinto suggested.
Lastly @Aldert thank you for your time but I'm too noob and under too much time pressure to understand what you suggested. Still appreciate it and might revisit this later.
Edit2: Ended up going with @CamiloTerevinto's suggestion. Cut my runtime down from 10 hours to a couple of minutes.
You don't need to sort bars ascending and descending on each iteration. Order bars just once before the loop by calling .OrderBy(f => f.ReadingTime) and then use LastOrDefault() and FirstOrDefault().
foreach (var foo in foos)
{
bar before = bars.LastOrDefault(b => b.ReadingTime <= foo.ReadingTime);
bar after = bars.FirstOrDefault(b => b.ReadingTime > foo.ReadingTime);
//...
}
This produces the same output as your code and runs faster.
For memory performance and strong typing, you can use a SortedDictionary<TKey, TValue> or the generic SortedList<TKey, TValue>; the non-generic SortedList manipulates objects. Because you compare DateTime values, you don't need to implement a comparer.
What's the difference between SortedList and SortedDictionary?
SortedList<>, SortedDictionary<> and Dictionary<>
Difference between SortedList and SortedDictionary in C#
For speed optimization you can use a doubly linked list, where each item points to the next and the previous items:
Doubly Linked List in C#
Linked List Implementation in C#
Using a linked list or a doubly linked list requires more memory, because each instance is embedded in a cell that also stores the next and previous references. In return, it can sometimes be the fastest way to parse and compare data, as well as to search, sort, reorder, add, remove and move items, because you manipulate linked references rather than an array.
You can also create powerful trees and manage data in a better way than with arrays.
You can use a binary search for quick lookup. Below is code where bars is sorted and each foo is looked up. You can read up on binary searches yourself and enhance the code by also sorting foos; in that case you can narrow the search range within bars...
The code generates 2 lists with 100 items each, then sorts bars and runs a binary search 100 times.
using System;
using System.Collections.Generic;
namespace ConsoleApp2
{
class BaseReading
{
private DateTime readingTime;
public BaseReading(DateTime dt)
{
readingTime = dt;
}
public DateTime ReadingTime
{
get { return readingTime; }
set { readingTime = value; }
}
}
class Foo:BaseReading
{
public Foo(DateTime dt) : base(dt)
{ }
}
class Bar: BaseReading
{
public Bar(DateTime dt) : base(dt)
{ }
}
class ReadingTimeComparer: IComparer<BaseReading>
{
public int Compare(BaseReading x, BaseReading y)
{
return x.ReadingTime.CompareTo(y.ReadingTime);
}
}
class Program
{
static private List<BaseReading> foos = new List<BaseReading>();
static private List<BaseReading> bars = new List<BaseReading>();
static private Random ran = new Random();
static void Main(string[] args)
{
for (int i = 0; i< 100;i++)
{
foos.Add(new BaseReading(GetRandomDate()));
bars.Add(new BaseReading(GetRandomDate()));
}
var rtc = new ReadingTimeComparer();
bars.Sort(rtc);
foreach (BaseReading br in foos)
{
int index = bars.BinarySearch(br, rtc);
}
}
static DateTime GetRandomDate()
{
long randomTicks = ran.Next((int)(DateTime.MaxValue.Ticks >> 32));
randomTicks = (randomTicks << 32) + ran.Next();
return new DateTime(randomTicks);
}
}
}
The only APIs available in the .NET platform for finding the next closest element, with a computational complexity better than O(N), are the List.BinarySearch and Array.BinarySearch methods:
// Returns the zero-based index of item in the sorted List<T>, if item is found;
// otherwise, a negative number that is the bitwise complement of the index of
// the next element that is larger than item or, if there is no larger element,
// the bitwise complement of Count.
public int BinarySearch (T item, IComparer<T> comparer);
These APIs are not 100% robust, because the correctness of the results depends on whether the underlying data structure is already sorted, and the platform does not check or enforce this condition. It's up to you to ensure that the list or array is sorted with the correct comparer, before attempting to BinarySearch on it.
These APIs are also cumbersome to use: when no direct match is found, you get the index of the next larger element as a bitwise complement, which is a negative number, and you have to apply the ~ operator to recover the actual index. Subtract one from that index to get the closest item from the other direction.
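The complement handling can be wrapped once so callers never see the negative value. The following is a sketch under stated assumptions: the list holds plain DateTime values already sorted ascending, "before" means the last element <= key and "after" means the first element > key (matching the question's query), and the names NeighborSearch and FindNeighbors are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class NeighborSearch
{
    // Assumes 'sorted' is in ascending order; BinarySearch gives wrong answers otherwise.
    // Returns the index of the last element <= key and of the first element > key.
    // An index of -1 means there is no such element.
    public static (int before, int after) FindNeighbors(List<DateTime> sorted, DateTime key)
    {
        int i = sorted.BinarySearch(key);
        if (i < 0)
        {
            // No exact match: ~i is the index of the first larger element (or Count).
            int next = ~i;
            return (next - 1, next < sorted.Count ? next : -1);
        }

        // Exact match: with duplicates BinarySearch may land on any of them,
        // so advance to the last equal element.
        while (i + 1 < sorted.Count && sorted[i + 1] == key) i++;
        return (i, i + 1 < sorted.Count ? i + 1 : -1);
    }
}
```

Each lookup is O(log n) apart from the short walk over duplicates, so processing all of foos costs roughly O(m log n) instead of sorting bars inside the loop.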
If you don't mind adding a third-party dependency to your app, you could consider the C5 library, which contains the TreeDictionary collection, with the interesting methods below:
// Find the entry in the dictionary whose key is the predecessor of the specified key.
public bool TryPredecessor(K key, out SCG.KeyValuePair<K, V> res);
// Find the entry in the dictionary whose key is the successor of the specified key.
public bool TrySuccessor(K key, out SCG.KeyValuePair<K, V> res);
There are also the TryWeakPredecessor and TryWeakSuccessor methods available, that consider an exact match as a predecessor or successor respectively. In other words they are analogous to the <= and >= operators.
C5 is a powerful and feature-rich library that offers lots of specialized collections; its main drawback is a somewhat idiosyncratic API.
You should get excellent performance by any of these options.

Stackoverflow only with very large ArrayLists

I'm using a recursive version of the insertion sort algorithm to sort 5000 objects based upon a randomly generated integer property, but I've been getting a stackoverflow exception only at an ArrayList of this size while working fine for ArrayLists of other sizes.
I used Console.WriteLine to see what the "position" integer goes up to in one of my methods and it ends up at 4719 before skipping a line and giving a stackoverflow exception. How should I get around this?
I should also mention that when testing an iterative version of insertion sort in the same Visual Studio solution and using an ArrayList of the same size of objects I do not get a stackoverflow exception.
My code for the recursive insertion sort is below (AL is the ArrayList):
public void IS()
{
ISRM(0);
}
private void ISRM(int position)
{
if (position == AL.Count)
return;
Console.WriteLine(position);
int PositionNext = position + 1;
ISRMNext(position, PositionNext);
ISRM(position + 1);
}
private void ISRMNext(int position, int PositionNext)
{
if ((PositionNext == 0) || (PositionNext == AL.Count))
return;
Webpage EntryNext = (Webpage)AL[PositionNext];
Webpage EntryBefore = (Webpage)AL[PositionNext - 1];
if (EntryBefore.getVisitCount() < EntryNext.getVisitCount())
{
Webpage temp = EntryBefore;
AL[PositionNext - 1] = AL[PositionNext];
AL[PositionNext] = temp;
}
ISRMNext(position, PositionNext - 1);
}
Well, first of all, sorting through recursive calls is a bad idea for several reasons.
As you've already found out, this easily leads to a stack overflow due to the limited size of the stack.
It will perform poorly by definition, since a function call and the accompanying allocation of a local function context on the stack are much more expensive than iterating through a plain collection with something like a while or for loop.
These are probably the two reasons why @Zer0 suggested it, but there's more to it.
There's a ready-made ArrayList.Sort() method waiting for you that takes a custom comparer. All you need to do is write that comparer for your custom objects according to whatever rules you want and call Sort(your_comparer). That's it. You do not need to reinvent the wheel by implementing your own sorting method, unless implementing a sorting method is the actual goal of your program... but I honestly doubt it.
So, it could be something like this (not tested!):
using System.Collections; // non-generic IComparer and ArrayList

class MyComparer : IComparer
{
public int Compare(object x, object y)
{
var _x = ((Webpage) x).getVisitCount();
var _y = ((Webpage) y).getVisitCount();
if (_x < _y)
{
return -1;
}
if (_x > _y)
{
return 1;
}
return 0;
}
}
Usage:
var myAL = new ArrayList();
// ... filling up the myAL
myAL.Sort(new MyComparer());
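If switching from ArrayList to the generic List<T> is an option, the comparer class can be replaced by a lambda. This is a sketch only: the Webpage class below is a hypothetical stand-in for the one in the question, and the descending order is an assumption based on the question's swap condition (which moves larger visit counts toward the front).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's Webpage class.
class Webpage
{
    private readonly int visitCount;
    public Webpage(int visitCount) { this.visitCount = visitCount; }
    public int getVisitCount() { return visitCount; }
}

static class SortDemo
{
    // Sorts in place, most visited first, without any recursion in user code.
    public static void SortByVisitsDescending(List<Webpage> pages)
    {
        pages.Sort((a, b) => b.getVisitCount().CompareTo(a.getVisitCount()));
    }
}
```

Swapping `a` and `b` in the lambda gives ascending order instead.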

Find next incremental value not in existing list using linq

I have two methods in an IntExtensions class to help generate the next available incremental value (which is not in a list of existing integers which need to be excluded).
I don't think I'm addressing the NextIncrementalValueNotInList method in the best way and am wondering if I can make better use of LINQ to return the next available int?
public static bool IsInList(this int value, List<int> ListOfIntegers) {
if (ListOfIntegers.Contains(value))
return true;
return false;
}
public static int NextIncrementalValueNotInList(this int value,
List<int> ListOfIntegers) {
int maxResult;
maxResult = ListOfIntegers.Max() + 1;
for (int i = value; i <= maxResult; i++)
{
if (!(i.IsInList(ListOfIntegers)))
{
return i;
}
}
return maxResult;
}
Using LINQ, your method will look like:
return Enumerable.Range(1, ListOfIntegers.Count + 1)
    .Except(ListOfIntegers)
    .First();
I assume the values start at 1.
You could also proceed like this:
Enumerable.Range(1, ListOfIntegers.Count)
.Where(i => !ListOfIntegers.Contains(i))
.Union(new []{ ListOfIntegers.Count + 1 })
.First();
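The two queries above start counting at 1, while the method in the question starts at the value it is called on. A hedged variant that honours the starting value might look like the sketch below; the names IntExtensionsSketch and NextValueNotIn are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class IntExtensionsSketch
{
    // Looks at taken.Count + 1 consecutive candidates starting at 'value'.
    // At most taken.Count of them can be excluded, so at least one is always free.
    public static int NextValueNotIn(this int value, List<int> taken)
    {
        return Enumerable.Range(value, taken.Count + 1)
                         .Except(taken)
                         .First();
    }
}
```

Except builds a set from the second sequence, so this is roughly linear in the list size, whereas the Where/Contains form rescans the list for every candidate.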
You don't actually need to calculate the Max value - just keep incrementing i until you find a value that doesn't exist in the list, e.g:
public static int NextIncrementalValueNotInList(this int value,
    List<int> ListOfIntegers)
{
    int i = value;
    while (true)
    {
        if (!i.IsInList(ListOfIntegers))
        {
            return i;
        }
        i++;
    }
}
Besides that, I'm not sure if there's much more you can do about this unless:
ListOfIntegers is guaranteed to be, or needs to be, sorted, or
ListOfIntegers doesn't actually need to be a List<int>
If the answer to the first is no, and to the second is yes, then you might instead use a HashSet<int>, which might provide a faster implementation by allowing you to simply use HashSet<T>'s own bool Contains(T) method:
public static int NextIncrementalValueNotInList(this int value,
    HashSet<int> ListOfIntegers)
{
    int i = value;
    while (true)
    {
        if (!ListOfIntegers.Contains(i))
        {
            return i;
        }
        i++;
    }
}
Note that this version shows how to do away with the Max check also.
Although be careful of premature optimisation - if your current implementation is fast enough, then I wouldn't worry. You should properly benchmark any alternative solution with extreme cases as well as real-world cases to see if there's actually any difference.
Also what you don't want to do is use my suggestion above by turning your list into a HashSet for every call. I'm suggesting changing entirely your use of List to HashSet - any piecemeal conversion per-call will negate any potential performance benefits due to the overhead of creating the HashSet.
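To make the "build the HashSet once" point concrete, here is a small sketch in which the set lives for the lifetime of an object instead of being recreated per call. The class name IdAllocator and the choice to reserve the returned value are assumptions made for illustration, not part of the question.

```csharp
using System;
using System.Collections.Generic;

class IdAllocator
{
    // Built once from the existing values, then reused by every call.
    private readonly HashSet<int> used;

    public IdAllocator(IEnumerable<int> existing)
    {
        used = new HashSet<int>(existing);
    }

    // Finds the first free value at or after 'start', reserves it, and returns it.
    public int Next(int start)
    {
        int i = start;
        // HashSet<T>.Add returns false when the value is already present.
        while (!used.Add(i)) i++;
        return i;
    }
}
```

With existing values 1, 2 and 4, successive calls to `Next(1)` return 3 and then 5, because each returned value is added to the set.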
Finally, if you're not actually expecting much fragmentation in your integer list, then it's possible that a HashSet might not be much different from the current Linq version, because it's possibly going to end up doing similar amounts of work anyway.

Can I retrieve the stored value x in a hashset given an object y where x.Equals(y)

[TestFixture]
class HashSetExample
{
[Test]
public void eg()
{
var comparer = new OddEvenBag();
var hs = new HashSet<int>(comparer);
hs.Add(1);
Assert.IsTrue(hs.Contains(3));
Assert.IsFalse(hs.Contains(0));
// THIS LINE HERE
var containedValue = hs.First(x => comparer.Equals(x, 3)); // i want something faster than this
Assert.AreEqual(1, containedValue);
}
public class OddEvenBag : IEqualityComparer<int>
{
public bool Equals(int x, int y)
{
return x % 2 == y % 2;
}
public int GetHashCode(int obj)
{
return obj % 2;
}
}
}
As well as checking if hs contains an odd number, I want to know what odd number it contains. Obviously I want a method that scales reasonably and does not simply iterate-and-search over the entire collection.
Another way to rephrase the question is, I want to replace the line below THIS LINE HERE with something efficient (say O(1), instead of O(n)).
Towards what end? I'm trying to intern a laaaaaaaarge number of immutable reference objects similar in size to a Point3D. Seems like using a HashSet<Foo> instead of a Dictionary<Foo,Foo> saves about 10% in memory. No, obviously this isn't a game changer but I figured it would not hurt to try it for a quick win. Apologies if this has offended anybody.
Edit: Link to similar/identical post provided by Balazs Tihanyi in comments, put here for emphasis.
The simple answer is no, you can't.
If you want to retrieve the stored object, you will need to use a map (a Dictionary in .NET, a HashMap in Java) instead of a set. There just isn't any suitable method in the set API to do what you are asking for otherwise.
One optimization you could make, if you must use a set for this, is to do a Contains check first and only iterate over the set if it returns true. Still, you would almost certainly find that the extra overhead of a map is tiny (essentially just another object reference per entry).
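A map-based interning pool along those lines could be sketched as follows. The Interner class is hypothetical; OddEvenComparer mirrors the comparer from the question so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;

// Same idea as the question's OddEvenBag: values are "equal" if they share parity.
class OddEvenComparer : IEqualityComparer<int>
{
    public bool Equals(int x, int y) { return x % 2 == y % 2; }
    public int GetHashCode(int obj) { return obj % 2; }
}

// Hypothetical interning pool: key and value are the same instance,
// so a lookup by an equal object returns the stored one in O(1) on average.
class Interner<T>
{
    private readonly Dictionary<T, T> pool;

    public Interner(IEqualityComparer<T> comparer)
    {
        pool = new Dictionary<T, T>(comparer);
    }

    // Returns the stored instance equal to 'candidate', storing 'candidate' if none exists.
    public T Intern(T candidate)
    {
        if (pool.TryGetValue(candidate, out T existing)) return existing;
        pool[candidate] = candidate;
        return candidate;
    }
}
```

After `Intern(1)`, a call to `Intern(3)` returns 1, the stored value. On newer runtimes (.NET Core 2.0 and .NET Framework 4.7.2 onward) HashSet<T> also has a TryGetValue method, which may make the second reference unnecessary there.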

Why do we need iterators in c#?

Can somebody provide a real life example regarding use of iterators. I tried searching google but was not satisfied with the answers.
You've probably heard of arrays and containers - objects that store a list of other objects.
But in order for an object to represent a list, it doesn't actually have to "store" the list. All it has to do is provide you with methods or properties that allow you to obtain the items of the list.
In the .NET framework, the interface IEnumerable is all an object has to support to be considered a "list" in that sense.
To simplify it a little (leaving out some historical baggage):
public interface IEnumerable<T>
{
IEnumerator<T> GetEnumerator();
}
So you can get an enumerator from it. That interface (again, simplifying slightly to remove distracting noise):
public interface IEnumerator<T>
{
bool MoveNext();
T Current { get; }
}
So to loop through a list, you'd do this:
var e = list.GetEnumerator();
while (e.MoveNext())
{
var item = e.Current;
// blah
}
This pattern is captured neatly by the foreach keyword:
foreach (var item in list)
// blah
But what about creating a new kind of list? Yes, we can just use List<T> and fill it up with items. But what if we want to discover the items "on the fly" as they are requested? There is an advantage to this, which is that the client can abandon the iteration after the first three items, and they don't have to "pay the cost" of generating the whole list.
To implement this kind of lazy list by hand would be troublesome. We would have to write two classes, one to represent the list by implementing IEnumerable<T>, and the other to represent an active enumeration operation by implementing IEnumerator<T>.
Iterator methods do all the hard work for us. We just write:
IEnumerable<int> GetNumbers(int stop)
{
for (int n = 0; n < stop; n++)
yield return n;
}
And the compiler converts this into two classes for us. Calling the method is equivalent to constructing an object of the class that represents the list.
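The "pay only for what you consume" point can be observed directly. In this sketch the Generated counter and the FirstThree helper are additions for illustration; GetNumbers is the iterator from the text with a counter added.

```csharp
using System;
using System.Collections.Generic;

static class LazyDemo
{
    // Counts how many items the iterator body has actually produced.
    public static int Generated;

    public static IEnumerable<int> GetNumbers(int stop)
    {
        for (int n = 0; n < stop; n++)
        {
            Generated++;
            yield return n;
        }
    }

    // Abandons the iteration after three items, even though a million were "available".
    public static List<int> FirstThree()
    {
        var result = new List<int>();
        foreach (int n in GetNumbers(1000000))
        {
            result.Add(n);
            if (result.Count == 3) break;
        }
        return result;
    }
}
```

After calling `LazyDemo.FirstThree()` on a fresh counter, Generated is 3 rather than 1000000: the loop body of the iterator only runs as far as the consumer asks it to.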
Iterators are an abstraction that decouples the concept of position in a collection from the collection itself. The iterator is a separate object storing the necessary state to locate an item in the collection and move to the next item in the collection. I have seen collections that kept that state inside the collection (i.e. a current position), but it is often better to move that state to an external object. Among other things it enables you to have multiple iterators iterating the same collection.
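As a small illustration of several iterators over one collection, nested foreach loops each obtain their own enumerator, so the inner loop does not disturb the position of the outer one. The names PairsDemo and AllPairs are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class PairsDemo
{
    // Builds every ordered pair of items by iterating the same list twice at once.
    public static List<string> AllPairs(List<string> items)
    {
        var pairs = new List<string>();
        foreach (string left in items)          // first enumerator
        {
            foreach (string right in items)     // second, independent enumerator
            {
                pairs.Add(left + right);
            }
        }
        return pairs;
    }
}
```

If the collection stored a single "current position" internally, the inner loop would reset or exhaust it and the outer loop could not continue where it left off.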
Simple example : a function that generates a sequence of integers :
static IEnumerable<int> GetSequence(int fromValue, int toValue)
{
if (toValue >= fromValue)
{
for (int i = fromValue; i <= toValue; i++)
{
yield return i;
}
}
else
{
for (int i = fromValue; i >= toValue; i--)
{
yield return i;
}
}
}
To do it without an iterator, you would need to create an array then enumerate it...
Iterate through the students in a class
The Iterator design pattern provides us with a common method of enumerating a list of items or array, while hiding the details of the list's implementation. This provides a cleaner use of the array object and hides unnecessary information from the client, ultimately leading to better code-reuse, enhanced maintainability, and fewer bugs. The iterator pattern can enumerate the list of items regardless of their actual storage type.
Iterate through a set of homework questions.
But seriously, Iterators can provide a unified way to traverse the items in a collection regardless of the underlying data structure.
Read the first two paragraphs here for a little more info.
A couple of things they're great for:
a) For 'perceived performance' while maintaining code tidiness - the iteration of something separated from other processing logic.
b) When the number of items you're going to iterate through is not known.
Although both can be done through other means, with iterators the code can be made nicer and tidier, as someone calling the iterator doesn't need to worry about how it finds the stuff to iterate through...
Real life example: enumerating directories and files, and finding the first [n] that fulfill some criteria, e.g. a file containing a certain string or sequence etc...
Besides everything else, they let you iterate through lazy sequences (IEnumerators). Each next element of such a sequence can be evaluated or initialized on the iteration step, which makes it possible to iterate through infinite sequences using a finite amount of resources...
The canonical and simplest example is that it makes infinite sequences possible without the complexity of having to write the class to do that yourself:
// generate every prime number
public IEnumerator<int> GetPrimeEnumerator()
{
yield return 2;
var primes = new List<int>();
primes.Add(2);
Func<int, bool> IsPrime = n => primes.TakeWhile(
p => p <= (int)Math.Sqrt(n)).FirstOrDefault(p => n % p == 0) == 0;
for (int i = 3; true; i += 2)
{
if (IsPrime(i))
{
yield return i;
primes.Add(i);
}
}
}
Obviously this would not be truly infinite unless you used a BigInt instead of int but it gives you the idea.
Writing this code (or similar) for each generated sequence would be tedious and error prone. Iterators do that for you. If the above example seems too complex for you, consider:
// generate every power of a number from start^0 to start^n
public IEnumerator<int> GetPowersEnumerator(int start)
{
yield return 1; // anything ^0 is 1
var x = start;
while(true)
{
yield return x;
x *= start;
}
}
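A variant that returns IEnumerable instead of IEnumerator can be consumed with foreach and LINQ, which makes it easy to take a finite prefix of the infinite sequence. This is a sketch with assumed names (Powers, Of); it uses long and checked arithmetic so that overflow raises an exception instead of silently wrapping.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Powers
{
    // Yields start^0, start^1, start^2, ... until the consumer stops asking
    // or the next value no longer fits in a long (OverflowException).
    public static IEnumerable<long> Of(long start)
    {
        long x = 1; // anything ^0 is 1
        while (true)
        {
            yield return x;
            x = checked(x * start);
        }
    }
}
```

For example, `Powers.Of(2).Take(5)` produces 1, 2, 4, 8, 16 and then stops; the infinite loop inside the iterator is never run past what Take requests.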
They come at a cost though. Their lazy behaviour means you cannot spot common errors (null parameters and the like) until the generator is first consumed rather than created without writing wrapping functions to check first. The current implementation is also incredibly bad(1) if used recursively.
Writing enumerations over complex structures like trees and object graphs is much easier, as the state maintenance is largely done for you; you simply write code to visit each item and don't worry about getting back to it.
(1) I don't use this word lightly: an O(n) iteration can become O(n^2).
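The wrapping function mentioned above, which checks arguments before the lazy part starts, usually takes the following shape. This is a minimal sketch with hypothetical names (SafeIterators, Doubled, DoubledIterator).

```csharp
using System;
using System.Collections.Generic;

static class SafeIterators
{
    // Not an iterator method itself, so the null check runs immediately on the call.
    public static IEnumerable<int> Doubled(IEnumerable<int> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return DoubledIterator(source);
    }

    // The actual iterator: its body does not run until the result is enumerated.
    private static IEnumerable<int> DoubledIterator(IEnumerable<int> source)
    {
        foreach (int n in source)
        {
            yield return n * 2;
        }
    }
}
```

With this split, `SafeIterators.Doubled(null)` throws at the call site. Had the check lived inside the iterator body, the exception would only surface later, when some other piece of code first enumerated the result.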
An iterator is an easy way of implementing the IEnumerator interface. Instead of making a class that has the methods and properties required for the interface, you just make a method that returns the values one by one and the compiler creates a class with the methods and properties needed to implement the interface.
If you for example have a large list of numbers, and you want to return a collection where each number is multiplied by two, you can make an iterator that returns the numbers instead of creating a copy of the list in memory:
public IEnumerable<int> GetDouble() {
foreach (int n in originalList) yield return n * 2;
}
In C# 3 you can do something quite similar using extension methods and lambda expressions:
originalList.Select(n => n * 2)
Or using LINQ:
from n in originalList select n * 2
IEnumerator<Question> myIterator = listOfStackOverFlowQuestions.GetEnumerator();
while (myIterator.MoveNext())
{
Question q;
q = myIterator.Current;
if (q.Pertinent == true)
PublishQuestion(q);
else
SendMessage(q.Author.EmailAddress, "Your question has been rejected");
}
foreach (Question q in listOfStackOverFlowQuestions)
{
if (q.Pertinent == true)
PublishQuestion(q);
else
SendMessage(q.Author.EmailAddress, "Your question has been rejected");
}
