I am trying to generate unique integer Ids that can be used from multiple threads.
public partial class Form1 : Form
{
private static int sharedInteger;
...
private static int ModifySharedIntegerSingleTime()
{
int unique = Interlocked.Increment(ref sharedInteger);
return unique;
}
private static void SimulateBackTestRow1()
{
while (true)
{
int num = ModifySharedIntegerSingleTime();
}
}
private static void SimulateBackTestRow2()
{
while (true)
{
int num = ModifySharedIntegerSingleTime();
}
}
Task modifyTaskOne = Task.Run(() => SimulateBackTestRow1());
Task modifyTaskTwo = Task.Run(() => SimulateBackTestRow2());
However, when I pass a number obtained from ModifySharedIntegerSingleTime to code that requires a number that has never been used before, I get collisions: some of the numbers are not unique.
What is the correct way to get unique int Ids in a thread-safe way?
You only have 2^32 unique values with an int. Since you're generating values as quickly as you can in a loop, it won't take long to generate more than the ~4 billion values available, at which point the counter wraps around and starts returning values you've already used, causing collisions. Either you have a bug in your program that makes it generate far more values than you actually need, or you need a type with more values, such as a long or a GUID.
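As a minimal sketch of the "use a long" suggestion, the same Interlocked.Increment pattern works on a 64-bit counter. The class and member names here (IdGenerator, NextId) are hypothetical and not from the question.

```csharp
using System.Threading;

public static class IdGenerator
{
    // 64-bit counter shared by all threads.
    // Even at one increment per nanosecond it would take roughly 292 years to overflow.
    private static long _lastId;

    // Atomically increments the counter and returns the new value,
    // so no two callers in this process ever receive the same id.
    public static long NextId()
    {
        return Interlocked.Increment(ref _lastId);
    }
}
```

Each thread would call IdGenerator.NextId() in place of ModifySharedIntegerSingleTime(). The ids are only unique within one process run; persisting them across restarts would need extra work.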
I have a SQL column of type Bigint. When I generate a unique number with the C# Random class in a multithreaded environment, I see duplicate values instead of unique ones. As far as I can see, a system GUID is the only option for generating unique Ids. Could you please help me solve this problem?
private Object thisLock = new Object();
public Random a = new Random(DateTime.Now.Ticks.GetHashCode());
private void NewNumber()
{
lock (thisLock)
{
MyNumber = a.Next(0, 10);
}
}
The Random class generates random values, not unique values. In your sample code the maximum number of unique integers is 10 (from 0 to 9). So if you called this method at least 11 times, you would be guaranteed one or more duplicates.
For a database you should use identity columns.
Your code should normally run without error, and I did not find one. The duplicate numbers are probably due to the small range.
You can use the following class to draw random numbers from a range. Each call returns a number from the range that has not been returned before.
class UniqueRandom
{
private readonly List<int> _currentList;
private readonly Random _random = new Random();
public UniqueRandom(IEnumerable<int> seed)
{
_currentList = new List<int>(seed);
}
public int Next()
{
if (_currentList.Count == 0)
{
throw new ApplicationException("No more numbers");
}
int i = _random.Next(_currentList.Count);
int result = _currentList[i];
_currentList.RemoveAt(i);
return result;
}
}
Create an instance of the UniqueRandom class and call its Next() method in NewNumber():
UniqueRandom u = new UniqueRandom(Enumerable.Range(0, 10));
private Object thisLock = new Object();
public Random a = new Random(DateTime.Now.Ticks.GetHashCode());
private void NewNumber()
{
lock (thisLock)
{
MyNumber = u.Next();
}
}
The SQL Server Bigint type is equivalent to the C# Int64 (long) type. To generate random long values in C#, look at this question. To ensure that the random number is also unique, add a unique constraint to the relevant datatable column. In case the random value already exists in the database, catch the constraint-violation exception, and try again with a new random value.
Regarding how to use the Random class correctly in a multithreaded environment, check out this question: Is C# Random Number Generator thread safe?
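One common way to use Random from several threads is to give each thread its own instance. The following is a rough sketch of that idea, not the linked question's exact answer; the names ThreadSafeRandom and NextInt64 are made up for illustration. Note that the values are random, not unique, so the unique constraint and retry described above are still needed.

```csharp
using System;
using System.Threading;

public static class ThreadSafeRandom
{
    // Starting point for per-thread seeds; incremented atomically so that
    // threads created in the same tick still get different seeds.
    private static int _seedCounter = Environment.TickCount;

    // Each thread lazily creates its own Random, so no instance is ever shared.
    private static readonly ThreadLocal<Random> Local =
        new ThreadLocal<Random>(() => new Random(Interlocked.Increment(ref _seedCounter)));

    // Builds a random 64-bit value (suitable for a Bigint column) from 8 random bytes.
    public static long NextInt64()
    {
        var bytes = new byte[8];
        Local.Value.NextBytes(bytes);
        return BitConverter.ToInt64(bytes, 0);
    }
}
```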
I have a collection that contains both permutations of each pair of orders, where OrderId is unique. For example, Order1 (Id = 1) and Order2 (Id = 2) appear as both 12 and 21. While processing a routing algorithm, a few conditions are checked, and once a combination is included in the final result, its reverse has to be ignored and need not be processed. Since the Id is an integer, I have created the following logic:
private static int GetPairKey(int firstOrderId, int secondOrderId)
{
var orderCombinationType = (firstOrderId < secondOrderId)
? new {max = secondOrderId, min = firstOrderId}
: new { max = firstOrderId, min = secondOrderId };
return (orderCombinationType.min.GetHashCode() ^ orderCombinationType.max.GetHashCode());
}
In this logic I create a Dictionary<int,int> whose key is produced by the GetPairKey method shown above. The method orders the two Ids so that a pair and its reverse yield the same hash code, which can be inserted into and looked up in the Dictionary; the value is a dummy and is ignored.
However, this logic seems to have a flaw: it does not work as expected in all cases. What am I doing wrong? Should I create the hash code differently? Would something like the following be a better choice?
Tuple.Create(minOrderId, maxOrderId).GetHashCode()
The relevant usage is:
foreach (var pair in localSavingPairs)
{
var firstOrder = pair.FirstOrder;
var secondOrder = pair.SecondOrder;
if (processedOrderDictionary.ContainsKey(GetPairKey(firstOrder.Id, secondOrder.Id))) continue;
The following code adds to the Dictionary:
processedOrderDictionary.Add(GetPairKey(firstOrder.Id, secondOrder.Id), 0); // the value 0 is a dummy and is not used
You need a key that can uniquely represent every possible pair of values.
That is different to a hash-code.
You could uniquely represent each value with a long or with a class or struct that contains all of the appropriate values. Since after a certain total size using long won't work any more, let's look at the other approach, which is more flexible and more extensible:
public class KeyPair : IEquatable<KeyPair>
{
public int Min { get; private set; }
public int Max { get; private set; }
public KeyPair(int first, int second)
{
if (first < second)
{
Min = first;
Max = second;
}
else
{
Min = second;
Max = first;
}
}
public bool Equals(KeyPair other)
{
return other != null && other.Min == Min && other.Max == Max;
}
public override bool Equals(object other)
{
return Equals(other as KeyPair);
}
public override int GetHashCode()
{
return unchecked(Max * 31 + Min);
}
}
Now, the GetHashCode() here will not be unique, but the KeyPair itself will be. Ideally the hashcodes will be very different to each other to better distribute these objects, but doing much better than the above depends on information about the actual values that will be seen in practice.
The dictionary will use that to find the item, but it will also use Equals to pick between those where the hash code is the same.
(You can experiment with this by having a version for which GetHashCode() always just returns 0. It will have very poor performance because collisions hurt performance and this will always collide, but it will still work).
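For completeness, the long alternative mentioned above could look roughly like this for exactly two int ids. This is a sketch under that assumption; PairKey and Create are hypothetical names. Because two 32-bit values fit exactly into 64 bits, the key is unique per unordered pair, unlike a hash code.

```csharp
using System;

public static class PairKey
{
    // Packs the smaller id into the high 32 bits and the larger id into the low 32 bits.
    // (1, 2) and (2, 1) produce the same key; any other pair produces a different one.
    public static long Create(int firstOrderId, int secondOrderId)
    {
        int min = Math.Min(firstOrderId, secondOrderId);
        int max = Math.Max(firstOrderId, secondOrderId);
        // The uint cast stops a negative max from sign-extending into the high bits.
        return ((long)min << 32) | (uint)max;
    }
}
```

The dictionary would then be keyed by long instead of int. This stops working as soon as a third id has to be part of the key, which is where the KeyPair class is the more extensible choice.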
First, 42.GetHashCode() returns 42. Second, 1 ^ 2 is identical to 2 ^ 1, so there's really no point in sorting numbers. Third, your "hash" function is very weak and produces a lot of collisions, which is why you're observing the flaws.
There are two options I can think of right now:
Use a slightly "stronger" hash function
Replace your Dictionary<int, int> with a Dictionary<string, int> whose keys are your two sorted numbers separated by whatever character you prefer, e.g. 56-6472
Given that XOR is commutative (so (a ^ b) will always be the same as (b ^ a)) it seems to me that your ordering is misguided... I'd just
(new {firstOrderId, secondOrderId}).GetHashCode()
.Net will fix you up a good well-distributed hashing implementation for anonymous types.
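A related sketch, assuming C# 7 value tuples are available: store the ordered pair itself in a HashSet rather than its hash code. The set uses the tuple's hash code only for bucketing and its equality to tell pairs apart, so colliding hashes cannot cause a wrong skip. ProcessedPairs and TryMarkProcessed are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public class ProcessedPairs
{
    // Holds each pair once, normalized so the smaller id always comes first.
    private readonly HashSet<(int Min, int Max)> _seen = new HashSet<(int Min, int Max)>();

    // Returns true the first time a pair is seen (in either order), false afterwards.
    public bool TryMarkProcessed(int firstOrderId, int secondOrderId)
    {
        var key = (Math.Min(firstOrderId, secondOrderId), Math.Max(firstOrderId, secondOrderId));
        return _seen.Add(key);
    }
}
```

With the XOR key from the question, the pairs (1, 2) and (3, 0) both map to 3 and the second would be skipped by mistake; here they are treated as different pairs.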
I am new to C#. I want to automatically generate a unique number inside a text box, which I can use as a reference number on a form that does asset registration. This reference number will be used as a unique identifier for each registered asset and will also be given to the asset owner for reference.
To do this, you can use a Guid (globally unique identifier). The chance that the value of a new Guid will be all zeros or equal to any other Guid is very low.
public static void Main()
{
Guid g = Guid.NewGuid();
Console.WriteLine(g);
}
You can find more about this here:
http://msdn.microsoft.com/en-us/library/system.guid.newguid(v=vs.110).aspx
Have you considered using GUIDs? They are pretty easy to generate and reasonably unique.
// This code example demonstrates the Guid.NewGuid() method.
using System;
class Sample
{
public static void Main()
{
Guid g;
// Create and display the value of two GUIDs.
g = Guid.NewGuid();
Console.WriteLine(g);
Console.WriteLine(Guid.NewGuid());
}
}
/*
This code example produces the following results:
0f8fad5b-d9cb-469f-a165-70867728950e
7c9e6679-7425-40de-944b-e07fc1f90ae7
*/
You can use a Guid.
Guid temp;
temp = Guid.NewGuid();
textBox1.Text = temp.ToString().Replace("-", "");
But be aware that a GUID is not strictly guaranteed to be unique; a collision is just extremely unlikely.
There are other ways, such as the Random class.
You can use TimeStamp along with the new GUID.
string uniqueKey = string.Concat(DateTime.Now.ToString("yyyyMMddHHmmssf"), Guid.NewGuid().ToString());
If you really need a number instead of a string as the unique key, you can use the timestamp alone with the following strategy. The lock ensures that no two threads run the code at the same time, and Thread.Sleep ensures that consecutive calls get distinct times at a resolution of a tenth of a second.
static object lockerObject = new object();
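static string GetUniqueKey()
{
lock (lockerObject)
{
string key = DateTime.Now.ToString("yyyyMMddHHmmssf");
// Sleep before releasing the lock so the next caller sees a later tenth of a second.
Thread.Sleep(100);
return key;
}
}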
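Alternatively, I found a way to do it without a timestamp, as follows. Note that this keeps only 64 of the GUID's 128 bits, so a collision is more likely than with a full GUID.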
public long GetUniqueKey()
{
byte[] buffer = Guid.NewGuid().ToByteArray();
return BitConverter.ToInt64(buffer, 0);
}
I started testing the uniqueness of the hash codes generated by my algorithm, and I wrote the following test class to see when the same hash code is generated.
class Program
{
static void Main(string[] args)
{
var hashes = new List<int>();
for (int i = 0; i < 100000; i++)
{
var vol = new Volume();
var code = vol.GetHashCode();
if (!hashes.Contains(code))
{
hashes.Add(code);
}
else
{
Console.WriteLine("Same hash code generated on the {0} retry", hashes.Count());
}
}
}
}
public class Volume
{
public Guid DriverId = Guid.NewGuid();
public Guid ComputerId = Guid.NewGuid();
public int Size;
public ulong VersionNumber;
public int HashCode;
public static ulong CurDriverEpochNumber;
public static Random RandomF = new Random();
public Volume()
{
Size = RandomF.Next(1000000, 1200000);
CurDriverEpochNumber ++;
VersionNumber = CurDriverEpochNumber;
HashCode = GetHashCodeInternal();
}
public int GetHashCodeInternal()
{
unchecked
{
var one = DriverId.GetHashCode() + ComputerId.GetHashCode() * 22;
var two = (ulong)Size + VersionNumber;
var result = one ^ (int)two;
return result;
}
}
}
The Guid fields DriverId and ComputerId and the int Size are random.
I assumed that at some point the same hash code would be generated, which would break work with big collections. The strange part is that the retry number at which the duplicate hash code is generated is the same each time! I ran the sample code several times and got nearly the same result: on the first run the duplicate appeared on retry 10170, on the second on 7628, on the third on 7628, and again and again on 7628. Sometimes I got slightly different results, but in most cases it was 7628.
I have no explanation for this.
Is it an error in the .NET random generator, or what?
Thanks, all. Now it is clear there was a bug in my code (Matthew Watson). I had to call GetHashCodeInternal() and not GetHashCode(). The version of GetHashCodeInternal() that gave me the fewest collisions was:
public int GetHashCodeInternal()
{
unchecked
{
var one = DriverId.GetHashCode() + ComputerId.GetHashCode();
var two = ((ulong)Size) + VersionNumber;
var result = one ^ (int)two << 32;
return result;
}
}
But it still produces a duplicate code at around 140 000... I think that is not good, because we have collections of around 10 000...
If you change your Console.WriteLine() to also print Volume.Size like so:
Console.WriteLine("Same hash code generated on the {0} retry ({1})", hashes.Count, vol.Size);
you will see that although hashes.Count is always the same for the first collision, vol.Size is usually different.
This seems to rule out the random number generator causing this issue - it looks like some strange property of GetHashCodeInternal().
Closer inspection reveals that you are calling the wrong hash code function.
This line: var code = vol.GetHashCode();
Should be: var code = vol.HashCode;
Try that instead! Because at the moment you are calling the default .Net GetHashCode() which is not doing what you want at all.
You will need to create a single random number generator and pass it in for reuse. Currently you're creating new instances too close together, which results in the same seed being used and hence the same sequence of numbers coming out.
Your results differ only at points where the seed is generated from the next tick of the clock, so the variation is just incidental.
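On the duplicate at around 140 000 mentioned in the question's update: any 32-bit hash is expected to collide at that scale, however good the function is. The sketch below uses the standard birthday-problem approximation and assumes the hash codes are uniformly distributed; HashCollisionEstimate is a hypothetical helper name.

```csharp
using System;

public static class HashCollisionEstimate
{
    // Approximate probability that n uniformly random 32-bit values
    // contain at least one duplicate: 1 - exp(-n(n-1) / (2 * 2^32)).
    public static double Probability(long n)
    {
        double buckets = 4294967296.0; // 2^32 possible int hash codes
        return 1.0 - Math.Exp(-(double)n * (n - 1) / (2.0 * buckets));
    }
}
```

Under that assumption the estimate is about 1% for 10 000 items and about 90% for 140 000 items, so a first duplicate near 140 000 is not a sign of a bad hash. Hash-based collections handle such duplicates through Equals, so they affect performance only, not correctness.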
The classes:
public class SomeCollection
{
public void IteratorReset()
{
index = -1;
}
public bool IteratorNext()
{
index++;
return index < Count;
}
public int Count
{
get
{
return floatCollection.Count;
}
}
public float CurrentValue
{
get
{
return floatCollection[index];
}
}
public int CurrentIndex
{
get
{
return intCollection[index];
}
}
}
Class that holds reference to 'SomeCollection':
public class ThreadUnsafeClass
{
public SomeCollection CollectionObj
{
get
{
return collectionObj;
}
}
}
Classes ClassA, ClassB and ClassC contain the following loop that iterates over CollectionObj:
for (threadUnsafeClass.CollectionObj.IteratorReset(); threadUnsafeClass.CollectionObj.IteratorNext(); )
{
int currentIntIndex = threadUnsafeClass.CollectionObj.CurrentIndex;
float currentfloatValue = threadUnsafeClass.CollectionObj.CurrentValue;
// ...
}
Since I'm only reading CollectionObj in the 3 classes, I'm using multithreading for speedup, but I'm not quite sure how to enforce thread safety. I added a lock in ThreadUnsafeClass when retrieving CollectionObj, but the application throws an out of range exception.
Any help is appreciated.
Thank you !
You're only reading the CollectionObj property, but then you're mutating the object that the value refers to. See this bit:
for (threadUnsafeClass.CollectionObj.IteratorReset();
threadUnsafeClass.CollectionObj.IteratorNext(); )
Both IteratorReset and IteratorNext mutate SomeCollection by changing the value of index. Basically, you can't do this safely with your current code. Several threads could all call IteratorNext() at the same time, for example. The first call returns true, but before that thread gets a chance to read the values, the other threads make the index invalid.
Why are you using the collection itself for iteration? Typically you'd implement IEnumerable<T> and return a new object in GetEnumerator. That way different threads could each get a different object representing "their" cursor over the same collection. They could all iterate over it, and all see all the values.
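A minimal sketch of that suggestion follows. The field types (List<int>, List<float>) and the constructor are assumptions, since the question does not show how SomeCollection stores its data. The iterator method creates a fresh enumerator object per call, so each thread gets its own cursor.

```csharp
using System.Collections;
using System.Collections.Generic;

public class SomeCollection : IEnumerable<KeyValuePair<int, float>>
{
    // Assumed storage; the original post does not show these declarations.
    private readonly List<int> intCollection;
    private readonly List<float> floatCollection;

    public SomeCollection(IEnumerable<int> indices, IEnumerable<float> values)
    {
        intCollection = new List<int>(indices);
        floatCollection = new List<float>(values);
    }

    public int Count
    {
        get { return floatCollection.Count; }
    }

    // Every call returns a new enumerator with its own position,
    // so concurrent readers never share an index.
    public IEnumerator<KeyValuePair<int, float>> GetEnumerator()
    {
        for (int i = 0; i < floatCollection.Count; i++)
        {
            yield return new KeyValuePair<int, float>(intCollection[i], floatCollection[i]);
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

ClassA, ClassB and ClassC would then each run foreach (var item in threadUnsafeClass.CollectionObj) and read item.Key and item.Value. This is safe only as long as no thread modifies the lists while others are reading them.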
The SomeCollection object is being referenced by each of the three classes A,B, and C, each of which is going to try and increment the internal index, causing the error(s). That said, you should be able to read objects in an array from multiple threads with something like the following:
public static object[] sharedList = new object[]{1,2,3,4,5};
public void Worker()
{
int localSum=0;
for(int i=0; i<sharedList.Length; i++){
localSum += (int)sharedList[i];
}
}
The important thing here is that each thread maintains its own position within the array, unlike with the collectionObj.
Locking the CollectionObj property won't help. One possible problem is that all 3 threads are calling IteratorReset(), which sets the index to -1. Imagine the scenario where A starts the for loop, and gets to the first line in the loop before getting interrupted. Now B comes in and calls IteratorReset(), then gets interrupted to let A run again. Thread A executes the CurrentIndex property, which internally uses index = -1 due to B running. Boom, out of range exception.
There are other ways this can generate bad results, but that's probably the easiest to see. Is the intention to have all three threads go through each item on their own? Or are you expecting A, B, and C to divide up the work (like a consumer queue)?