Set operations in .NET C#

I'm working on something related to rough sets right now. The project uses a lot of set operations and manipulation. I've been using string operations as a stopgap measure for set operations. That worked fine until we needed to process some ungodly amount of data (500,000 records with 40+ columns each) through the algorithm.
I know that there is no set data structure in .NET 2.0 (2.0 was the latest when I started the project). I want to know if there is any library that offers fast set operations in .NET C#, or if 3.5 has added a native set data structure.
Thanks.

.NET 3.5 already has a native set data type: HashSet<T>. You might also want to look at the HashSet and LINQ set operators for the operations.
Back in the .NET 1.0 days there was a third-party set data type, Iesi.Collections, which was later extended with .NET 2.0 generics as Iesi.Collections.Generic.
You might want to try and look at all of them to see which one would benefit you the most. :)
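For what it's worth, here is a minimal sketch of the basic set algebra on HashSet<T> (.NET 3.5+). The class and method names (SetAlgebra, Union, Intersection, Difference) are made up for illustration; only UnionWith, IntersectWith and ExceptWith are the real HashSet<T> methods. Those methods modify the set in place, so the sketch copies the first input before applying them.

```csharp
using System.Collections.Generic;

// Hypothetical helper class, not part of the framework.
public static class SetAlgebra
{
    // Elements that are in a, in b, or in both.
    public static HashSet<T> Union<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        HashSet<T> result = new HashSet<T>(a); // copy, so the inputs stay untouched
        result.UnionWith(b);
        return result;
    }

    // Elements that are in both a and b.
    public static HashSet<T> Intersection<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        HashSet<T> result = new HashSet<T>(a);
        result.IntersectWith(b);
        return result;
    }

    // Elements that are in a but not in b.
    public static HashSet<T> Difference<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        HashSet<T> result = new HashSet<T>(a);
        result.ExceptWith(b);
        return result;
    }
}
```

If you don't need to keep the original set around, calling UnionWith and friends directly on it avoids the copy, which may matter at the data sizes mentioned in the question.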

LINQ supports some set operations. See the LINQ 101 page for examples.
There is also the HashSet class (.NET 3.5).
Here are Microsoft's guidelines for set operations in .NET:
HashSet and LINQ Set Operations
List of set operations supported by the HashSet class:
HashSet Collection Type
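As a rough illustration of the LINQ side, the sketch below uses the standard Union, Intersect and Except operators from System.Linq. The wrapper class and method names (LinqSets, InEither, InBoth, OnlyInFirst) are invented for the example. Unlike the HashSet methods, these operators leave their inputs alone and produce a new lazily evaluated sequence, which is materialized here with ToList.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper around the LINQ set operators.
public static class LinqSets
{
    // Distinct elements found in either sequence.
    public static List<string> InEither(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Union(second).ToList();
    }

    // Distinct elements found in both sequences.
    public static List<string> InBoth(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Intersect(second).ToList();
    }

    // Distinct elements of first that do not appear in second.
    public static List<string> OnlyInFirst(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Except(second).ToList();
    }
}
```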

Update: This is for .NET 2.0. For .NET 3.5, refer to the posts by aku and Jon.
This is a good reference for efficiently representing sets in .NET.

It may be worth taking a look at C5, it's a generic collection library for .NET which includes sets.
Note that I haven't looked into it much, but it seems to be a pretty fantastic collection library.

Try HashSet in .NET 3.5.
This page from a member of the .NET BCL team has some good information on the intent of HashSet

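I have been abusing the Dictionary class in .NET 2.0 as a set:
// Only the keys matter; every key maps to the same placeholder value.
private readonly Dictionary<object, object> dict = new Dictionary<object, object>();
private readonly object dummy = "ok";

public void Add(object el) {
    dict[el] = dummy; // the indexer overwrites, so adding twice is harmless
}

public bool Contains(object el) {
    return dict.ContainsKey(el);
}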

You can use LINQ to Objects in C# 3.0.

Have you ever thought about using F#? This seems like a job for a functional programming language.

You should take a look at the C5 Generic Collection Library. This library is a systematic approach to fixing holes in the .NET class library by providing missing structures, as well as replacing existing ones with a set of well-designed interfaces and generic classes.
Among others, there is HashSet<T>, a generic set class based on linear hashing.

Related

Removing LINQ-To-Objects from C# 3.5 project to convert it to .net 2.0

Is there any quick way to replace LINQ method calls (such as Concat, SequenceEqual, Skip, Take, ...) in a project with their equivalents (such as static methods or anything else)?
Is there any .NET 2.0 library that simulates the behavior of these methods?
You could use LINQBridge to get by with only .NET 2.0 and still have the default LINQ to Objects behavior.
They are quite easy to hack together yourself; it shouldn't take you many hours if you only use a few of them.
I have some here if you want a start: http://sharpkom.svn.sourceforge.net/viewvc/sharpkom/ExternalComponents/LinqEx/LinqEx.cs?revision=1566&view=markup
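To give an idea of how little code the hand-rolled route takes, here is a sketch of four of the methods from the question written as plain static methods in C# 2.0 style (no extension methods, no lambdas, no var). The class name Linq20 is made up, and this is not the code behind the link above. It also skips the argument validation the real LINQ methods perform.

```csharp
using System.Collections.Generic;

// Hypothetical .NET 2.0 stand-ins for a few LINQ to Objects methods.
public static class Linq20
{
    // Yields at most 'count' items from the start of the sequence.
    public static IEnumerable<T> Take<T>(IEnumerable<T> source, int count)
    {
        if (count <= 0) yield break;
        foreach (T item in source)
        {
            yield return item;
            if (--count == 0) yield break;
        }
    }

    // Drops the first 'count' items and yields the rest.
    public static IEnumerable<T> Skip<T>(IEnumerable<T> source, int count)
    {
        foreach (T item in source)
        {
            if (count > 0) { count--; continue; }
            yield return item;
        }
    }

    // Yields all items of 'first', then all items of 'second'.
    public static IEnumerable<T> Concat<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        foreach (T item in first) yield return item;
        foreach (T item in second) yield return item;
    }

    // True when both sequences have the same length and equal items pairwise.
    public static bool SequenceEqual<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        EqualityComparer<T> comparer = EqualityComparer<T>.Default;
        using (IEnumerator<T> e1 = first.GetEnumerator())
        using (IEnumerator<T> e2 = second.GetEnumerator())
        {
            while (e1.MoveNext())
            {
                if (!e2.MoveNext() || !comparer.Equals(e1.Current, e2.Current)) return false;
            }
            return !e2.MoveNext(); // 'second' must be exhausted as well
        }
    }
}
```

A call such as items.Skip(2).Take(3) then becomes Linq20.Take(Linq20.Skip(items, 2), 3).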

What is IList used for?

I have already searched for it on MSDN and I read the definition that it is a non generic ....
My problem is that I don't understand what it is used for or when to use it, so I hope someone can help me. Thanks.
In addition to the older code that doesn't know about generics, there are also a lot of cases where you know you have a list of data, but you can't know of what ahead of time. Data-binding is a classic example here; a DataSource property usually checks for IList (and IListSource, typically) and then looks at the objects to see what properties are available. It can't use generics in this case.
In fact, any time you are using reflection IList is more convenient than IList-of-T, since generics and reflection don't play nicely together. It can be done, but it is a pain. Unfortunately since IList-of-T doesn't derive from IList there are cases where this can fail - but it is a good 95% rule.
In particular, IList lets you use the indexer and add/remove items; things that IEnumerable doesn't let you do.
Most app-level code (i.e. the everyday code that you write for your application) should probably focus on the generic versions; however there is also a lot of infrastructure code around that must use reflection.
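A small sketch of the "list of something, but I don't know what" case described above. The class and method names (ListInspector, DescribeFirstItem) are hypothetical, and real data-binding code does considerably more than this. The point is only that the method takes a plain object, checks for the non-generic IList, and can then use Count and the indexer without knowing the element type.

```csharp
using System.Collections;

// Hypothetical helper in the spirit of a DataSource property.
public static class ListInspector
{
    public static string DescribeFirstItem(object dataSource)
    {
        // List<T> and arrays both implement the non-generic IList.
        IList list = dataSource as IList;
        if (list == null) return "not a list";
        if (list.Count == 0) return "empty list";

        object first = list[0]; // the indexer works without knowing T
        return first == null ? "null item" : first.GetType().Name;
    }
}
```

Passing a List<int>, a string[] or an ArrayList all works here, while something like a HashSet<int> falls through to the "not a list" branch because it doesn't implement IList.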
IList (the non-generic version) is a leftover from C# version one, when we didn't have generics.
With C# 2 and .NET Framework 2.0, generics were added to the language and generic implementations of the collection types were added. Therefore, the non-generic versions were no longer needed, but they couldn't be removed from the framework, since it would make porting code from .NET 1.1 to 2.0 a pain.
Today, you almost always use IList<T>; the primary reason IList is still around is backwards compatibility.
It is an interface that existed before generics were introduced in .NET 2.0.
It is there for backwards compatibility.
Use the generic version.
It is an interface normally implemented by collection classes to provide list-like methods. It dates from .NET versions before 2.0, i.e. before generics were introduced, so if you are using framework version 1.0 or 1.1 you will see IList in use.

What would be a good replacement for C++ vector in C#?

I'm working on improving my skills in other languages, coming from using C++ as my primary programming language. My current project is hammering down C#.NET, as I have heard it is a good in-between language for one who knows both C++ and VB.NET.
Typically when working with an unknown number of elements in C++ I would declare my variable as a vector and just go from there. Vectors don't seem to exist in C#, and in my current program I have been using ArrayLists instead, but I'm starting to wonder if it's a good habit to use ArrayLists, as I read somewhere that they are a carryover from .NET 1.0.
Long question short: what is the most commonly used list type in C#?
If you target pre .NET 2.0 versions, use ArrayList
If you target .NET 2.0+ then use generic type List<T>
You may need to find replacements for other C++ standard containers, so here is possible mapping of C++ to .NET 2.0+ similar types or equivalents:
std::vector - List<T>
std::list - LinkedList<T>
std::map - Dictionary<K, V>
std::set - HashSet<T>
std::multimap - Dictionary<K, List<V>>
I would recommend you explore the System.Collections namespace, especially the System.Collections.Generic set of objects. The built-in functionality can be strongly typed across the various Lists, Dictionaries and NameValueCollections to provide you with a wide range of capabilities. They are also extendable, so if they don't do EXACTLY what you need, you just extend them and add the new functionality.
That'd be List<T>, I suppose. ArrayList is the non-generic type and—as you correctly observed—a leftover from the .NET 1 times. Starting with .NET 2 you can use Generics and therefore List<T>.
Short answer: List<T>. You can find the docs here.
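For someone coming from std::vector, a quick sketch of the List<T> equivalents may help. The class and method names (VectorLike, Squares) are invented for the example, and the comments map each call to its rough C++ counterpart.

```csharp
using System.Collections.Generic;

// Hypothetical demo of List<T> used the way std::vector<int> would be.
public static class VectorLike
{
    public static List<int> Squares(int n)
    {
        List<int> values = new List<int>(); // std::vector<int> values;
        for (int i = 0; i < n; i++)
        {
            values.Add(i * i);              // values.push_back(i * i);
        }
        return values;
    }
}
```

From there, values[2] is the same indexer access as in C++, values.Count stands in for size(), and values.RemoveAt(0) is roughly erase(begin()).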

Is there an equivalent for Java WeakHashMap class in C#?

Is there a C# class that provides map with weak keys or/and weak values?
Or at least WeakHashMap like functionality.
In .NET 3.5 and below, there is no such structure available. However, I wrote one for a side project and posted the code at the following location.
Starting with .NET 4.0, there is a structure called ConditionalWeakTable in the System.Runtime.CompilerServices namespace that also does the trick.
Prior to .NET 4, the CLR did not provide the functionality necessary to implement a map of this form. In particular, Java provides the ReferenceQueue<T> class, which WeakHashMap uses to manage the weak keys in the map. Since there is no equivalent to this class in .NET, there is no clean way to build an equivalent Dictionary.
In .NET 4, a new class ConditionalWeakTable<TKey, TValue> was added as part of an effort to improve the ability of the CLR to support dynamic languages. This class uses a new type of garbage collection handle, which is implemented within the CLR itself and exposed in mscorlib.dll through the internal DependentHandle structure.
This means the following for you:
There is no equivalent to WeakHashMap prior to .NET 4.
Starting with .NET 4, and continuing at least through .NET 4.5.1, the only way to support the functionality of WeakHashMap is to use the ConditionalWeakTable class (which is sealed).
Additional information is found in the following post:
Is it possible to create a truely weak-keyed dictionary in C#?
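A minimal sketch of how ConditionalWeakTable might be used to hang extra data off objects without keeping them alive (.NET 4+). The Annotations class and its Attach/Find methods are hypothetical; Add and TryGetValue are the actual table methods. Two things differ from a regular Dictionary: both the key and value types must be reference types, and keys are compared by reference identity rather than by Equals/GetHashCode.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical helper that attaches a note to arbitrary objects.
public static class Annotations
{
    // The table does not keep its keys alive; an entry disappears
    // once its key object has been garbage collected.
    private static readonly ConditionalWeakTable<object, string> notes =
        new ConditionalWeakTable<object, string>();

    public static void Attach(object target, string note)
    {
        notes.Add(target, note); // throws ArgumentException if target already has an entry
    }

    public static string Find(object target)
    {
        string note;
        return notes.TryGetValue(target, out note) ? note : null;
    }
}
```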
The closest platform equivalent is probably a Dictionary<K, WeakReference<V>>. That is, it's just a regular dictionary, but with the values being weak references.

C# Data Structure Like Dictionary But Without A Value

Is there any data structure in C# that is like a dictionary but that only has a key and doesn't have a value. I basically want a list of integers that I can quickly lookup and see if a certain value is in the list. Granted, for my current use, a List would not cause any performance problem, but it just doesn't seem to fit well with the intent of what my code is doing.
Yes, it's called a HashSet<T>, and available in version 3.5 of the .NET framework. If you use .NET version 2.0, you can use a Dictionary and set values to null.
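Something along these lines, as a sketch for the integer lookup case from the question. The IdRegistry class and its method names are made up; Add and Contains are the real HashSet<int> methods. Add returns false when the value was already present, which is sometimes handy for spotting duplicates.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper: a key-only "dictionary" of integers.
public sealed class IdRegistry
{
    private readonly HashSet<int> ids = new HashSet<int>();

    // Returns true if the id was newly added, false if it was already there.
    public bool Register(int id)
    {
        return ids.Add(id);
    }

    // Hash-based membership test, no value to store or ignore.
    public bool IsKnown(int id)
    {
        return ids.Contains(id);
    }
}
```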
If 3.5 is not an option, you could do something like Dictionary<int, int> and simply ignore the value. I've done this in 2.0, and I tend to set the value to the same as the key.
If you're not targeting .NET 3.5, Power Collections (open source) also provides a Set implementation.
Or use a SortedList, where keys have to be unique.
