check for duplicates in arraylist - c#

I have an arraylist that contains items called Room. Each Room has a roomtype such as kitchen, reception etc.
I want to check the arraylist to see if any rooms of that type exist before adding it to the list.
Can anyone recommend a neat way of doing this without the need for multiple foreach loops?
(.NET 2.0)
I havent got access to the linq technology as am running on .net 2.0. I should have stated that in the question.
Apologies

I would not use ArrayList here; since you have .NET 2.0, use List<T> and all becomes simple:
List<Room> rooms = ...
string roomType = "lounge";
bool exists = rooms.Exists(delegate(Room room) { return room.Type == roomType; });
Or with C# 3.0 (still targetting .NET 2.0)
bool exists = rooms.Exists(room => room.Type == roomType);
Or with C# 3.0 and either LINQBridge or .NET 3.5:
bool exists = rooms.Any(room => room.Type == roomType);
(the Any usage will work with more types, not just List<T>)

if (!rooms.Any (r => r.RoomType == typeToFind /*kitchen, ...*/))
//add it or whatever

From your question it's not 100% clear to me if you want to enforce the rule that there may be only one room of a given type, or if you simply want to know.
If you have the invariant that no collection of Rooms may have more than one of the same Room type, you might try using a Dictionary<Type, Room>.
This has the benefit of not performing a linear search on add.
You would add a room using the following operations:
if(rooms.ContainsKey(room.GetType()))
{
// Can't add a second room of the same type
...
}
else
{
rooms.Add(room.GetType(), room);
}

Without using lambda expressions:
void AddRoom(Room r, IList<Room> rooms, IDictionary<string, bool> roomTypes)
{
if (!roomTypes.Contains(r.RoomType))
{
rooms.Add(r);
roomTypes.Add(r.RoomType, true);
}
}
It doesn't actually matter what the type of the value in the dictionary is, because the only thing you're ever looking at is the keys.

Another way is to sort the array, then walk the elements until you find a pair of adjacent duplicates. Make it to the end, and the array is dupe-free.

I thought using lists and doing Exists where an operation that takes O(n) time.
Using Dictionary instead is O(1) and is preferred if memory is not a problem.
If you do not need the sequential List I would try using a Dictionary like this:
Dictionary<Type, List<Room>> rooms = new Dictionary<Type, List<Room>>;
void Main(){
KitchenRoom kr = new KitchenRoom();
DummyRoom dr = new DummyRoom();
RoomType1 rt1 = new RoomType1();
...
AddRoom(kr);
AddRoom(dr);
AddRoom(rt1);
...
}
void AddRoom(Room r){
Type roomtype = r.GetType();
if(!rooms.ContainsKey(roomtype){ //If the type is new, then add it with an empty list
rooms.Add(roomtype, new List<Room>);
}
//And of course add the room.
rooms[roomtype].Add(r);
}
You basically have a list of different roomtypes. But this solution is only OK if you don't need the arraylist. But for large lists this will be the fastest one.
I had a solution once with List<string> with 300.000+ items. Comparing each element with another list of almost the same size took humongous 12hrs to do. Switched the logic to using Dictionary instead and down to 12 minutes. For larger lists I always go Dictionary<mytype, bool> where bool is just a dummy not being used.

Related

list property containing two types (string/int)

I need to have a property that will be an array that can hold both ints and strings.
if i set the property to an array of ints it should be ints so when I am searching through this array the search will be fast, and at odd times this property will also contain strings which the search will be slow.
Is there any other way other than the following to have a list that contain native types
two properties one for ints and one for strings
use List< object >
UPDATE:
The use-case is as follow. I have a database field [ReferenceNumber] that holds the values (integers and strings) and another field [SourceID] (used for other things) which can be used to determine if record holds an int or string.
I will be fetching collections of these records based on the source id, of course depending on what the source is, the list either will be integers or strings. Then I will go through this collection looking for certain reference numbers, if they exist not add them or they dont then add them. I will be pre-fetching a lot of records instead of hitting the database over and over.
so for example if i get a list for sourceid =1 that means they are ints and if searching i want the underline list to be int so the search will be fast. and if sourceid say is 2 which means they are strings and very rare its okay if the search is slow because those number of records are not that many and a performance hit on searching through strings is okay.
I will go through this collection looking for certain reference numbers, if they exist not add them or they dont then add them.
It sounds to me like you don't need a List<>, but rather a HashSet<>. Simply use a HashSet<object>, and Add() all the items, and the collection will ignore duplicate items. It will be super-fast, regardless of whether you're dealing with ints or strings.
On my computer, the following code shows that it takes about 50 milliseconds to populate an initial 400,000 unique strings in the hashset, and about 2 milliseconds to add an additional 10,000 random strings:
var sw = new Stopwatch();
var initial= Enumerable.Range(1, 400000).Select(i => i.ToString()).ToList();
sw.Start();
var set = new HashSet<object>(initial);
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
var random = new Random();
var additional = Enumerable.Range(1, 10000).Select(i => random.Next(1000000).ToString()).ToList();
sw.Restart();
foreach (var item in additional)
{
set.Add(item);
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
Also, in case it's important, HashSet<>s do retain order of insertion.
The only other thing I would suggest is a custom object that implements IComparable
class Multitype: IComparable
{
public int? Number { get; set; }
public string Words {get; set; }
public int CompareTo(object obj)
{
Multitype other = obj as Multitype;
if (Number != null && other != null && other.Number != null)
{
//...
}
else
{
//...
}
}
}
There will be some extra comparison steps between numbers, but not as much as string parsing.
Are you storing a ton of data, is that performance difference really going to matter?
It's possible to use generics if you implement them on the class. Not sure if this solves your problem. Would be interested to hear the real-world example of a property that can have different types.
class Foo<T>
{
public List<T> GenericList { get; set; }
public Foo()
{
this.GenericList = new List<T>();
}
}
If by "use List" you mean the object primitive or provided System.Object, that is an option, but I think it would behoove you to make your own wrapper object -- IntString or similar -- that would handle everything for you. It should implement IComparable, as the other gal mentioned.
You can increase the efficiency of sorting your object in collections by writing a CompareTo method that does exactly what you need it to. Writing a good CompareTo method is a whole can of worms in itself, so you should probably start a new question for that, if that's what you want.
If you're looking for a property that is strongly typed as a List<Int> or List<String> at instantiation, but can change afterwards, then you want an interface. IList exists, but won't help you, since that must also be strongly typed upon declaration. You should probably make something like IIntStringList that can only be one of List<Int> or List<String>.
Sorry this answer doesn't have that many details (I need to leave the office now), but I hope I've set you on the right track.

Store variables in list/array and loop through later c#

Sorry, I think I was not clear earlier. I am trying to do as O.R.mapper says below- create a list of arbitrary variables and then get their values later in foreach loop.
Moreover, all variables are of string type so I think can come in one list. Thanks.
Is there a way to store variables in a list or array then then loop through them later.
For example: I have three variables in a class c named x,y and Z.
can I do something like:
public List Max_One = new List {c.x,c.y,c.z}
and then later in the code
foreach (string var in Max_One)
{
if ((var < 0) | (var > 1 ))
{
// some code here
}
}
Is there a particular reason why you want to store the list of variables beforehand? If it is sufficient to reuse such a list whenever you need it, I would opt for creating a property that returns an IEnumerable<string>:
public IEnumerable<string> Max_One {
get {
yield return c.x;
yield return c.y;
yield return c.z;
}
}
The values returned in this enumerable would be retrieved only when the property getter is invoked. Hence, the resulting enumerable would always contain the current values of c.x, c.y and c.z.
You can then iterate over these values with a foreach loop as alluded to by yourself in your question.
This might not be practical if you need to gradually assemble the list of variables; in that case, you might have to work with reflection. If this is really required, please let me know; I can provide an example for that, but it will become more verbose and complex.
Yes, e.g. if they are all strings:
public List<string> Max_One = new List<string> {c.x,c.y,c.z};
This uses the collection initializer syntax.
It doesn't make sense to compare a string to an int, though. This is a valid example:
foreach (string var in Max_One)
{
if (string.IsNullOrEmpty(var))
{
// some code here
}
}
If your properties are numbers (int, for example) you can do this:
List<int> Max_One = new List<int> { c.x, c.y, c.Z };
and use your foreach like this
foreach(int myNum in Max_One) { ... } //you can't name an iterator 'var', it's a reserved word
Replace int in list declaration with the correct numeric type (double, decimal, etc.)
You could try using:
List<object> list = new List<object>
{
c.x,
c.y,
c.z
};
I will answer your question in reverse way
To start with , you cannot name your variable with "var" since it is reserved name. So what you can do for the foreach is
foreach (var x in Max_One)
{
if ((x< 0) || (x> 1 ))
{
// some code here
}
}
if you have .Net 3.0 and later framework, you can use "var" to define x as a member of Max_One list without worrying about the actual type of x. if you have older than the version 3.0 then you need to specify the datatype of x, and in this case your code is valid (still risky though)
The last point (which is the your first point)
public List Max_One = new List {c.x,c.y,c.z}
There are main thing you need to know , that is in order to store in a list , the members must be from the same datatype, so unless a , b , and c are from the same datatype you cannot store them in the same list EXCEPT if you defined the list to store elements of datatype "object".
If you used the "Object" method, you need to cast the elements into the original type such as:
var x = (int) Max_One[0];
You can read more about lists and other alternatives from this website
http://www.dotnetperls.com/collections
P.s. if this is a homework, then you should read more and learn more from video tutorials and books ;)

Can't make simple list operation in C#

I'm trying to make a function with list.
It is to sort and delete duplicates.
It sorts good, but don't delete duplictates.
What's the problem?
void sort_del(List<double> slist){
//here i sort slist
//get sorted with duplicates
List<double> rlist = new List<double>();
int new_i=0;
rlist.Add(slist[0]);
for (i = 0; i < size; i++)
{
if (slist[i] != rlist[new_i])
{
rlist.Add(slist[i]);
new_i++;
}
}
slist = new List<double>(rlist);
//here get without duplicates
}
It does not work because slist is passed by value. Assigning rlist to it has no effect on the caller's end. Your algorithm for detecting duplicates seems fine. If you do not want to use a more elegant LINQ way suggested in the other answer, change the method to return your list:
List<double> sort_del(List<double> slist){
// Do your stuff
return rlist;
}
with double you can just use Distinct()
slist = new List<double>(rlist.Distinct());
or maybe:
slist.Distinct().Sort();
You're not modifying the underlying list. You're trying to add to a new collection, and you're not checking if the new one contains one of the old ones correctly.
If you're required to do this for homework (which seems likely, as there are data structures and easy ways to do this with LINQ that others have pointed out), you should break the sort piece and the removal of duplication into two separate methods. The methods that removes duplicates should accept a list as a parameter (as this one does), and return the new list instance without duplicates.

Convert IList<T1> to IList<BaseT1>

As the easiest way to convert the IList<T1> to IList<BaseT1>?
IList<T1>.Count() is very large number!!!
class BaseT1 { };
class T1 : BaseT1
{
static public IList<BaseT1> convert(IList<T1> p)
{
IList<BaseT1> result = new List<BaseT1>();
foreach (BaseT1 baseT1 in p)
result.Add(baseT1);
return result;
}
}
You'll get much better performance in your implementation if you specify the size of the result list when it is initalized, and call the Add method on List<T> directly:
List<BaseT1> result = new List<BaseT1>(p.Count);
that way, it isn't resizing lots of arrays when new items get added. That should yield an order-of-magnitude speedup.
Alternatively, you could code a wrapper class that implements IList<BaseT1> and takes an IList<T1> in the constructor.
linq?
var baseList = derivedList.Cast<TBase>();
Edit:
Cast returns an IEnumerable, do you need it in a List? List can be an expensive class to deal with
IList<T1>.Count() is very large number!!!
Yes, which means that no matter what syntax sugar you use, the conversion is going to require O(n) time and O(n) storage. You cannot cast the list to avoid re-creating it. If that was possible, client code could add an element of BaseT1 to the list, violating the promise that list only contains objects that are compatible with T1.
The only way to get ahead is to return an interface type that cannot change the list. Which would be IEnumerable<BaseT1> in this case. Allowing you to iterate the list, nothing else. That conversion is automatic in .NET 4.0 thanks to its support for covariance. You'll have to write a little glue code in earlier versions:
public static IEnumerable<BaseT1> enumerate(IList<T1> p) {
foreach (BaseT1 item in p) yield return item;
}

Removing duplicate string from List (.NET 2.0!)

I'm having issues finding the most efficient way to remove duplicates from a list of strings (List).
My current implementation is a dual foreach loop checking the instance count of each object being only 1, otherwise removing the second.
I know there are MANY other questions out there, but they all the best solutions require above .net 2.0, which is the current build environment I'm working in. (GM and Chrysler are very resistant to changes ... :) )
This limits the possible results by not allowing any LINQ, or HashSets.
The code I'm using is Visual C++, but a C# solution will work just fine as well.
Thanks!
This probably isn't what you're looking for, but if you have control over this, the most efficient way would be to not add them in the first place...
Do you have control over this? If so, all you'd need to do is a myList.Contains(currentItem) call before you add the item and you're set
You could do the following.
List<string> list = GetTheList();
Dictionary<string,object> map = new Dictionary<string,object>();
int i = 0;
while ( i < list.Count ) {
string current = list[i];
if ( map.ContainsKey(current) ) {
list.RemoveAt(i);
} else {
i++;
map.Add(current,null);
}
}
This has the overhead of building a Dictionary<TKey,TValue> object which will duplicate the list of unique values in the list. But it's fairly efficient speed wise.
I'm no Comp Sci PhD, but I'd imagine using a dictionary, with the items in your list as the keys would be fast.
Since a dictionary doesn't allow duplicate keys, you'd only have unique strings at the end of iteration.
Just remember when providing a custom class to override the Equals() method in order for the Contains() to function as required.
Example
List<CustomClass> clz = new List<CustomClass>()
public class CustomClass{
public bool Equals(Object param){
//Put equal code here...
}
}
If you're going the route of "just don't add duplicates", then checking "List.Contains" before adding an item works, but its O(n^2) where n is the number strings you want to add. Its no different from your current solution using two nested loops.
You'll have better luck using a hashset to store items you've already added, but since you're using .NET 2.0, a Dictionary can substitute for a hash set:
static List<T> RemoveDuplicates<T>(List<T> input)
{
List<T> result = new List<T>(input.Count);
Dictionary<T, object> hashSet = new Dictionary<T, object>();
foreach (T s in input)
{
if (!hashSet.ContainsKey(s))
{
result.Add(s);
hashSet.Add(s, null);
}
}
return result;
}
This runs in O(n) and uses O(2n) space, it will generally work very well for up to 100K items. Actual performance depends on the average length of the strings -- if you really need to maximum performance, you can exploit some more powerful data structures like tries make inserts even faster.

Categories