In my last project I found myself iterating over many arrays or lists of strings in order to find a specific string within.
I have to ask: is there any way to find a specific member inside an array in less than O(n)?
An O(n) solution in C# (consider there are no duplicates):
foreach (string st in Arr)
{
    if (st == "Hello")
    {
        Console.WriteLine("Hey!");
        break;
    }
}
EDIT: I didn't ask my question quite right. I also want to change the member I find, not only look it up.
So my snippet changes to:
for (int i = 0; i < Arr.Length; i++)
{
    if (Arr[i] == "Hello")
    {
        Arr[i] = "Changed"; // a foreach iteration variable is read-only in C#, so a for loop is needed here
        break;
    }
}
Can you somehow get to O(log n)? If so, can you explain how it is done and how it is more efficient than my solution?
Thanks for any light on that matter!
is there any way less than O(n) in order to find a specific member
inside an array?
If you have a collection of strings with no regard for duplicates, and order doesn't matter, you should use a HashSet<T>, where with a decent hash distribution lookups are O(1):
var hashSet = new HashSet<string> { "A", "B", "C" };
if (hashSet.Contains("A"))
{
Console.WriteLine("hey");
}
In case you need more than lookups, e.g. accessing a member at a specific index, then HashSet<T> should not be your pick. You need to specify exactly what you're going to be doing with the collection if you want a more elaborate answer.
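If, per the question's edit, you also need to change the member you find, one option (a sketch of my own, not part of the answer above) is to pair the array with a Dictionary<string, int> that maps each value to its index, giving O(1) find-and-replace as long as you keep the index in sync:

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        string[] arr = { "Foo", "Hello", "Bar" };

        // Build an index: value -> position (assumes no duplicates).
        var index = new Dictionary<string, int>();
        for (int i = 0; i < arr.Length; i++)
            index.Add(arr[i], i);

        // O(1) find-and-replace instead of an O(n) scan.
        int pos;
        if (index.TryGetValue("Hello", out pos))
        {
            arr[pos] = "Changed";
            index.Remove("Hello");
            index.Add("Changed", pos); // keep the index in sync with the array
        }

        Console.WriteLine(arr[1]); // prints "Changed"
    }
}
```

The trade-off is extra memory and the bookkeeping on every mutation; whether that pays off depends on how often you search versus how often you modify.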
There is a project called "C5 Generic Collection Library" at the University of Copenhagen. They have implemented an extended set of collection classes that may help you, such as a hash set that allows duplicates (called HashBag) and a hash-indexed array list, which may prove useful...
Related
So, I've found a piece of code like this:
class CustomDictionary
{
Dictionary<string, string> backing;
...
public string Get(int index)
{
return backing.ElementAtOrDefault(index); //use linq extensions on IEnumerable
}
}
And then this was used like so:
for (int i = 0; i < mydictionary.Count; i++)
{
    var value = mydictionary.Get(i);
}
Aside from the performance problems and ugliness of doing it this way, is this code actually correct? I.e., is the IEnumerable on Dictionary guaranteed to always return things in the same order, assuming that nothing in the dictionary is modified during the iteration?
This is NOT guaranteed. It is for a SortedDictionary<>, of course, and also for arrays and lists. But NOT for a Dictionary.
Chances are, it will be stable if the dictionary is not changed - but it's just not guaranteed. You have to ask yourself - do you feel lucky? ;)
If you want to get the elements in the order they were inserted, then you should probably look into Stack<T> or Queue<T>, depending on which elements you want first.
Yes, you'll get the same items.
As you noted, though, the method you presented is a very inefficient way to do it.
ElementAtOrDefault is a LINQ extension method on IEnumerable, meaning that on each call it iterates all the way to the specified index.
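If indexed access in a stable order is genuinely needed, one way out (a sketch of my own, not the original class) is to keep the keys in a parallel List<string>, which both guarantees insertion order and makes Get O(1):

```csharp
using System;
using System.Collections.Generic;

class CustomDictionary
{
    private readonly Dictionary<string, string> backing = new Dictionary<string, string>();
    private readonly List<string> keysInOrder = new List<string>(); // insertion order

    public void Add(string key, string value)
    {
        backing.Add(key, value);
        keysInOrder.Add(key);
    }

    public int Count { get { return backing.Count; } }

    // O(1), and the order is guaranteed, unlike enumerating the Dictionary.
    public string Get(int index)
    {
        return backing[keysInOrder[index]];
    }
}

class Program
{
    static void Main()
    {
        var d = new CustomDictionary();
        d.Add("a", "1");
        d.Add("b", "2");
        Console.WriteLine(d.Get(1)); // prints "2"
    }
}
```

A removal method would have to update both structures, which is the usual cost of maintaining two views of the same data.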
Currently I have the following syntax (list is a list containing objects with many different properties, where Title is one of them):
for (int i = 0; i < list.Count; i++)
{
    if (title == list[i].Title)
    {
        // do something
    }
}
How can I access list[i].Title without having to loop over my entire collection? Since my list tends to grow large, this can impact the performance of my program.
I have a lot of similar syntax across my program (accessing public properties through a for loop and by index), but I'm sure there must be a better and more elegant way of doing this.
The Find method does seem to be an option, since my list contains objects.
I don't know exactly what you mean, but technically speaking, this is not possible without a loop.
Maybe you mean using LINQ, for example:
list.Where(x=>x.Title == title)
It's worth mentioning that the iteration is not skipped, but simply wrapped inside the LINQ query.
Hope this helps.
EDIT
In other words, if you're really concerned about performance, keep coding the way you're already doing it. Otherwise, choose LINQ for more concise and clear syntax.
Here comes Linq:
var listItem = list.Single(i => i.Title == title);
It throws an exception if there's no item matching the predicate. Alternatively, there's SingleOrDefault.
If you want a collection of items matching the title, there's:
var listItems = list.Where(i => i.Title == title);
I had to use it for a condition; add this if you don't need the index.
using System.Linq;
use
if (list.Any(x => x.Title == title))
{
    // do something here
}
This will tell you whether any element satisfies your given condition.
I'd suggest storing these in a Hashtable. You can then access an item in the collection using the key, it's a much more efficient lookup.
var myObjects = new Hashtable();
myObjects.Add(yourObject.Title, yourObject);
...
var myRetrievedObject = myObjects["TargetTitle"];
Consider creating an index. A dictionary can do the trick. If you need the list semantics, subclass and keep the index as a private member...
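A sketch of that index idea (the Item class and Title property here are stand-ins for whatever the list actually holds):

```csharp
using System;
using System.Collections.Generic;

class Item
{
    public string Title;
}

class Program
{
    static void Main()
    {
        var list = new List<Item>
        {
            new Item { Title = "First" },
            new Item { Title = "Second" }
        };

        // Build the index once; O(1) lookups afterwards (assumes unique titles).
        var byTitle = new Dictionary<string, Item>();
        foreach (Item item in list)
            byTitle[item.Title] = item;

        Item found;
        if (byTitle.TryGetValue("Second", out found))
            Console.WriteLine(found.Title); // prints "Second"
    }
}
```

If the list is mutated after the index is built, the index has to be updated too, which is why subclassing and keeping it as a private member is attractive.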
ObservableCollection is a list so if you don't know the element position you have to look at each element until you find the expected one.
Possible optimization
If your elements are sorted, use a binary search to improve performance; otherwise use a Dictionary as an index.
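The binary-search option might look like this (a sketch assuming the list is kept sorted by Title; the IComparable<Item> implementation is my own addition so that List<T>.BinarySearch can use it):

```csharp
using System;
using System.Collections.Generic;

class Item : IComparable<Item>
{
    public string Title;

    // Order items by Title so BinarySearch can compare them.
    public int CompareTo(Item other)
    {
        return string.Compare(Title, other.Title, StringComparison.Ordinal);
    }
}

class Program
{
    static void Main()
    {
        var list = new List<Item>
        {
            new Item { Title = "Apple" },
            new Item { Title = "Banana" },
            new Item { Title = "Cherry" }
        };

        // O(log n), but only valid because the list is sorted by Title.
        int i = list.BinarySearch(new Item { Title = "Banana" });
        Console.WriteLine(i); // prints 1
    }
}
```

BinarySearch returns a negative number when the item is not found, so the result should be checked before indexing with it.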
You're looking for a hash based collection (like a Dictionary or Hashset) which the ObservableCollection is not. The best solution might be to derive from a hash based collection and implement INotifyCollectionChanged which will give you the same behavior as an ObservableCollection.
Well, if you have N objects and you need the Title of all of them, you have to use a loop. If you only need the titles and you really want to improve this, maybe you can keep a separate array containing only the titles; this would improve performance.
You need to define the amount of memory available and the number of objects you can handle before saying this hurts performance, and in any case the solution would be to change the design of the program, not the algorithm.
Maybe this approach would solve the problem:
int result = obsCollection.IndexOf(title);
IndexOf(T)
Searches for the specified object and returns the zero-based index of the first occurrence within the entire Collection.
(Inherited from Collection)
https://learn.microsoft.com/en-us/dotnet/api/system.collections.objectmodel.observablecollection-1?view=netframework-4.7.2#methods
An ObservableCollection can also be converted to a List:
BuchungsSatz item = BuchungsListe.ToList().Find(x => x.BuchungsAuftragId == DGBuchungenAuftrag.CurrentItem.Id);
It's probably a really basic question, but I can't find the answer anywhere.
I'm trying to loop through the input and put the results into an array using C#. From what I read, the array has to have its number of elements set first.
Is there a way to just loop through and have the array's number of elements depend on the number of inputs?
TIA
Use a List object so that you can add the results of each iteration in your loop to the List until you have processed all of the input. Then you won't have to keep track of indices / sizes of the array.
The List class has a ToArray() method that you can use after the loop, if you want to store the results in an array. You can either return the array from your method, or clear out the List object after converting it to an array, in order to keep from having extra copies of the data lying around.
If you know the input length, you can initialize the array with that value. Otherwise, you should use a List for a strongly typed collection or an ArrayList.
If your input is an IEnumerable, you can use List.AddRange to add all the values without the need to loop.
You should note that List and ArrayList work exactly the way you want. As the docs state, they use "an array whose size is dynamically increased as required".
You want to use a List.
And even better, use generic lists whenever possible. Since they came out in 2.0, I don't think I've ever used a plain ArrayList.
List<string> myNames = new List<string>();
myNames.Add("kevin");
string[] myNamesTurnedToStringArray = myNames.ToArray();
Taking things one step further, I find that nearly every time I need a collection, I need it for more than a few simple statements. If you find that you are hitting your list all over your codebase, you might consider subclassing List<T>:
public class Widgets : List<Widget> // Widget class defined elsewhere
{
    public Widgets() : base() {}
    public Widgets(IEnumerable<Widget> items) : base(items) {}

    public Widget GetWidgetById(int id)
    {
        // assuming only one ID possible
        return this.Where(w => w.Id == id).Single();
    }

    public string[] GetAllNames()
    {
        return this.Select(w => w.Name).ToArray();
    }
}
//... later in some other class ....
Widgets w = new Widgets(dataFromDatabase);
Widget customerWidget = w.GetWidgetById(customerWidgetId);
string[] allNames = w.GetAllNames();
This is a nice OOPy way of encapsulating your list functionality in one place & often is a great way to separate your concerns organically.
Have a look at the ArrayList collection. You can give it an initial allocation size, and you can also Add items to it.
The capacity of a ArrayList is the number of elements the ArrayList can hold. As elements are added to a ArrayList, the capacity is automatically increased as required through reallocation. The capacity can be decreased by calling TrimToSize or by setting the Capacity property explicitly.
Of course, it will be more efficient to preallocate it to a reasonable size than to add items one at a time from 0 - but it will work either way.
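With the generic List<T>, the same preallocation is just a constructor argument (a small sketch; the 1000 is an arbitrary guess at the input size):

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        // Preallocate capacity for ~1000 items to avoid repeated reallocation;
        // the list still grows automatically if more are added.
        var results = new List<string>(1000);
        for (int i = 0; i < 3; i++)
            results.Add("line " + i);

        string[] asArray = results.ToArray();
        Console.WriteLine(asArray.Length); // prints 3
    }
}
```

Capacity is only a hint for the backing array; Count still reflects the number of items actually added.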
What is the fastest way to find out whether two ICollection<T> collections contain precisely the same entries? Brute force is clear, I was wondering if there is a more elegant method.
We are using C# 2.0, so no extension methods if possible, please!
Edit: the answer would be interesting both for ordered and unordered collections, and would hopefully be different for each.
use C5
http://www.itu.dk/research/c5/
ContainsAll:
"Check if all items in a supplied collection are in this bag (counting multiplicities). The items to look for. True if all items are found."
[Tested]
public virtual bool ContainsAll<U>(SCG.IEnumerable<U> items) where U : T
{
HashBag<T> res = new HashBag<T>(itemequalityComparer);
foreach (T item in items)
if (res.ContainsCount(item) < ContainsCount(item))
res.Add(item);
else
return false;
return true;
}
First compare the .Count of the collections; if they have the same count, then do a brute-force compare on all elements. The worst case is O(n). This covers the case where the order of elements needs to be the same.
In the second case, where the order is not the same, you need to use a dictionary to store the count of elements found in the collections. Here's a possible algorithm:
Compare the collection Counts: return false if they are different.
Iterate the first collection:
If an item doesn't exist in the dictionary, add an entry with Key = Item, Value = 1 (the count).
If an item exists, increment the count for that item in the dictionary.
Iterate the second collection:
If an item is not in the dictionary, return false.
If an item is in the dictionary, decrement the count for that item.
If the count == 0, remove the item.
Return Dictionary.Count == 0.
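Written out in C# 2.0 style (no extension methods), the counting-dictionary algorithm above might look like this; UnorderedEquals is just a name I picked:

```csharp
using System;
using System.Collections.Generic;

class Program
{
    // True if both collections contain the same items with the same
    // multiplicities, ignoring order. O(n) using a dictionary of counts.
    static bool UnorderedEquals(ICollection<string> a, ICollection<string> b)
    {
        if (a.Count != b.Count)
            return false;

        Dictionary<string, int> counts = new Dictionary<string, int>();
        foreach (string item in a)
        {
            int c;
            counts.TryGetValue(item, out c); // c stays 0 if not present
            counts[item] = c + 1;
        }
        foreach (string item in b)
        {
            int c;
            if (!counts.TryGetValue(item, out c))
                return false;          // item not in first collection
            if (c == 1)
                counts.Remove(item);   // last occurrence consumed
            else
                counts[item] = c - 1;
        }
        return counts.Count == 0;
    }

    static void Main()
    {
        string[] x = { "a", "b", "b" };
        string[] y = { "b", "a", "b" };
        string[] z = { "a", "b", "c" };
        Console.WriteLine(UnorderedEquals(x, y)); // True
        Console.WriteLine(UnorderedEquals(x, z)); // False
    }
}
```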
For ordered collections, you can use the SequenceEqual() extension method defined by System.Linq.Enumerable:
if (firstCollection.SequenceEqual(secondCollection))
Do you mean the same entries, or the same entries in the same order?
Anyway, assuming you want to compare whether they contain the same entries in the same order, "brute force" is really your only option in C# 2.0. I know what you mean by non-elegant, but if the atomic comparison itself is O(1), the whole process should be O(N), which is not that bad.
If the entries need to be in the same order (besides being the same), then I suggest, as an optimization, that you iterate both collections at the same time and compare the current entry in each. Otherwise, brute force is the way to go.
Oh, and another suggestion: you could override Equals for the collection class and implement the equality logic there (depends on your project, though).
Again, using the C5 library, having two sets, you could use:
C5.ICollection<T> set1 = new C5.HashSet<T>();
C5.ICollection<T> set2 = new C5.HashSet<T>();
if (set1.UnsequencedEquals (set2)) {
// Do something
}
The C5 library includes a heuristic that actually tests the unsequenced hash codes of the two sets first (see C5.ICollection<T>.GetUnsequencedHashCode()) so that if the hash codes of the two sets are unequal, it doesn't need to iterate over every item to test for equality.
Also something of note to you is that C5.ICollection<T> inherits from System.Collections.Generic.ICollection<T>, so you can use C5 implementations while still using the .NET interfaces (though you have access to less functionality through .NET's stingy interfaces).
Brute force takes O(n), comparing all elements (assuming they are sorted), which I would think is the best you can do, unless there is some property of the data that makes it easier.
I guess for the unsorted case it's O(n*n).
In which case, I would think a solution based around a merge sort would probably help.
For example, could you re-model it so that there was only one collection? Or 3 collections, one for those in collection A only, one for B only and for in both - so if the A only and B only are empty - then they are the same... I am probably going off on totally the wrong tangent here...
I have a function that returns a Collection<string>, and it calls itself recursively to eventually return one big Collection<string>.
Now I just wonder what the best approach to merging the lists is. Collection.CopyTo() only copies to string[], and using a foreach() loop feels inefficient. However, since I also want to filter out duplicates, I feel like I'll end up with a foreach that calls Contains() on the Collection.
I wonder, is there a more efficient way to have a recursive function return a list of strings without duplicates? I don't have to use a Collection; it can be pretty much any suitable data type.
Only exclusion: I'm bound to Visual Studio 2005 and .NET 3.0, so no LINQ.
Edit: To clarify: the function takes a user out of Active Directory, looks at the direct reports of that user, and then recursively looks at the direct reports of every one of those users. So the end result is a list of all users in the "command chain" of a given user. Since this is executed quite often and at the moment takes 20 seconds for some users, I'm looking for ways to improve it. Caching the result for 24 hours is also on my list, btw, but I want to see how to improve it before applying caching.
If you're using List<> you can use .AddRange to add one list to the other list.
Or you can use yield return to combine lists on the fly like this:
public IEnumerable<string> Combine(IEnumerable<string> col1, IEnumerable<string> col2)
{
foreach(string item in col1)
yield return item;
foreach(string item in col2)
yield return item;
}
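Since the question rules out LINQ (and HashSet<T> only arrived in .NET 3.5), a Dictionary can stand in as a set to filter out duplicates while combining. A sketch along those lines (CombineDistinct is my own name):

```csharp
using System;
using System.Collections.Generic;

class Program
{
    // Yields each distinct string exactly once, in first-seen order.
    // Dictionary stands in for the HashSet that .NET 2.0/3.0 lacks.
    static IEnumerable<string> CombineDistinct(IEnumerable<string> col1,
                                               IEnumerable<string> col2)
    {
        Dictionary<string, bool> seen = new Dictionary<string, bool>();
        foreach (string item in col1)
        {
            if (!seen.ContainsKey(item))
            {
                seen.Add(item, true);
                yield return item;
            }
        }
        foreach (string item in col2)
        {
            if (!seen.ContainsKey(item))
            {
                seen.Add(item, true);
                yield return item;
            }
        }
    }

    static void Main()
    {
        string[] a = { "alice", "bob" };
        string[] b = { "bob", "carol" };
        foreach (string name in CombineDistinct(a, b))
            Console.WriteLine(name); // alice, bob, carol
    }
}
```

Dictionary lookups are O(1) on average, so this avoids the O(n) Contains() calls on a Collection that the question worries about.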
You might want to take a look at Iesi.Collections and Extended Generic Iesi.Collections (because the first edition was made in 1.1 when there were no generics yet).
Extended Iesi has an ISet class which acts exactly as a HashSet: it enforces unique members and does not allow duplicates.
The nifty thing about Iesi is that it has set operators instead of methods for merging collections, so you have the choice between a union (|), intersection (&), XOR (^) and so forth.
I think HashSet<T> is a great help.
The HashSet<T> class provides
high performance set operations. A set
is a collection that contains no
duplicate elements, and whose elements
are in no particular order.
Just add items to it and then use CopyTo.
Update: HashSet<T> is in .Net 3.5
Maybe you can use Dictionary<TKey, TValue>. Setting a duplicate key to a dictionary will not raise an exception.
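For example, using the indexer (which, unlike Add, silently overwrites rather than throwing):

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        Dictionary<string, bool> seen = new Dictionary<string, bool>();
        seen["alice"] = true;
        seen["alice"] = true; // duplicate key: no exception, value just overwritten
        Console.WriteLine(seen.Count); // prints 1
    }
}
```

The keys then form the duplicate-free result; dict.Keys can be copied out at the end.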
Can you pass the Collection into your method by reference so that you can just add items to it? That way you don't have to return anything. This is what it might look like if you did it in C#.
class Program
{
    static void Main(string[] args)
    {
        Collection<string> myitems = new Collection<string>();
        MyMethod(ref myitems);
        Console.WriteLine(myitems.Count.ToString());
        Console.ReadLine();
    }

    static void MyMethod(ref Collection<string> myitems)
    {
        myitems.Add("string");
        if (myitems.Count < 5)
            MyMethod(ref myitems);
    }
}
As stated by @Zooba, passing by ref is not necessary here; if you pass by value it will also work.
As far as merging goes:
I wonder, is there a more efficient
way to have a recursive function that
returns a list of strings without
duplicates? I don't have to use a
Collection, it can be pretty much any
suitable data type.
Your function assembles a return value, right? You're splitting the supplied list in half, invoking self again (twice) and then merging those results.
During the merge step, why not just check before you add each string to the result? If it's already there, skip it.
Assuming you're working with sorted lists of course.
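That merge step for sorted inputs might look like this (a sketch; MergeDistinct is a name I made up, and it assumes both inputs are already sorted):

```csharp
using System;
using System.Collections.Generic;

class Program
{
    // Merges two sorted lists into one sorted list without duplicates.
    static List<string> MergeDistinct(List<string> a, List<string> b)
    {
        List<string> result = new List<string>();
        int i = 0, j = 0;
        while (i < a.Count || j < b.Count)
        {
            // Take the smaller head, or whichever list still has items.
            string next;
            if (j >= b.Count ||
                (i < a.Count && string.CompareOrdinal(a[i], b[j]) <= 0))
                next = a[i++];
            else
                next = b[j++];

            // Because inputs are sorted, a duplicate can only be the
            // last element appended, so one comparison filters it out.
            if (result.Count == 0 || result[result.Count - 1] != next)
                result.Add(next);
        }
        return result;
    }

    static void Main()
    {
        List<string> a = new List<string>(new string[] { "a", "c", "e" });
        List<string> b = new List<string>(new string[] { "b", "c", "d" });
        Console.WriteLine(string.Join(",", MergeDistinct(a, b).ToArray()));
        // prints a,b,c,d,e
    }
}
```

Each element is looked at once, so the merge itself is O(n + m) with no Contains() scans.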